AN INTRODUCTION TO THE


THEORY OF STATISTICS. 


BY 

G. UDNY YULE, C.B.E., M.A., F.R.S.

FELLOW OF ST JOHN’S COLLEGE, AND
FORMERLY READER IN STATISTICS, CAMBRIDGE;

FELLOW OF UNIVERSITY COLLEGE, LONDON;

HONORARY VICE-PRESIDENT OF THE ROYAL STATISTICAL
SOCIETY OF LONDON;

HONORARY MEMBER OF THE AMERICAN STATISTICAL ASSOCIATION;
MEMBER OF THE INTERNATIONAL STATISTICAL INSTITUTE;

FELLOW OF THE ROYAL ANTHROPOLOGICAL INSTITUTE;
HONORARY MEMBER OF THE STATE STATISTICAL COUNCIL OF THE
CZECHOSLOVAK REPUBLIC.


With 53 Figures and [illegible].



TENTH EDITION, REVISED.


LONDON:

CHARLES GRIFFIN AND COMPANY, LIMITED,

42 DRURY LANE, W.C. 2.

1932 

[All Rights Reserved.]



Printed in Great Britain by
Neill & Co., Ltd., Edinburgh.



PREFACE TO THE TENTH EDITION.


That a new edition of An Introduction to the Theory of Statistics
should again be called for within three years of the issue of
the last is satisfactory evidence that the work continues to hold
its own against its numerous younger competitors. At the time
of going to press negotiations are in progress for the issue of a
Spanish edition.

The attention of the student is directed to the Supplements
at the end of the book, in which, to save expense in revision, all
new matter has been incorporated. In particular, Supplement
II. gives the direct proof of the formulæ for regressions, which,
for the student with some knowledge of differential calculus,
will be preferable to the indirect deduction of Chap. IX. Supplements
III. and IV. deal with important subjects not covered in
the body of the work. The additional references on pp. 390
et seq. have been revised to date for the present edition, but
readers must bear in mind that this revision is necessarily
closed some months before the book finally goes to press.
With the growth of the “literature” bibliography becomes
more and more difficult and laborious: the time almost seems
to have come for the publication of some periodical index
giving brief abstracts of papers and short notices of books.

All new matter in the present edition has been duly incor- 
porated in the index, which has been revised extensively. 

The present edition marks an epoch both for the author and 
the book. At the end of the last academic year I resigned the
teaching post which I had held since 1912, feeling that the work 
now calls for a younger man and a better qualified mathematician.
As for the book, it has now come of age, the first
edition having been published in 1911. Over 10,000 copies in
all have been sold, the success of the work having far exceeded
expectations. One element at least in that success, in my
belief, is the fact that the book was definitely founded on experience,
personal experience in statistical work and personal
experience in teaching; and the same may be said of later
additions and revisions. I make this statement partly in
self-defence. Correspondents have frequently requested me
to include chapters on subjects, e.g. technical applications of
statistical method, of which I know nothing and have no
personal experience: my reply has always been a refusal.

But the success of a book is certainly due to its publishers as
well as its author, and I am glad to give them my acknowledgments:
it is very pleasant to look back on our friendly
relations over so many years. To my readers also let me take
this opportunity of sending my greetings and thanks for their
kindly judgment.

G. U. Y.

May 1932 



PREFACE TO THE FIRST EDITION.


The following chapters are based on the courses of instruction 
given during my tenure of the Newmarch Lectureship in Statistics 
at University College, London, in the sessions 1902-1909. The
variety of illustrations and examples has, however, been increased
to render the book more suitable for the use of biologists and
others besides those interested in economic and vital statistics,
and some of the more difficult parts of the subject have been
treated in greater detail than was possible in a sessional course
of some thirty lectures. For the rest, the chapters follow closely
the arrangement of the course, the three parts into which the
volume is divided corresponding approximately to the work of
the three terms. To enable the student to proceed further with
the subject, fairly detailed lists of references to the original 
memoirs have been given at the end of each chapter : exercises 
have also been added for the benefit, more especially, of the 
student who is working without the assistance of a teacher. 

The volume represents an attempt to work out a systematic 
introductory course on statistical methods — the methods available 
for discussing, as distinct from collecting, statistical data — suited 
to those who possess only a limited knowledge of mathematics : 
an acquaintance with algebra up to the binomial theorem, 
together with such elements of co-ordinate geometry as are now
generally included therewith, is all that is assumed. I hope that 
it may prove of some service to the students of the diverse 
sciences in which statistical methods are now employed. 

My most grateful thanks are due to Mr R. H. Hooker not only
for reading the greater part of the manuscript, and the proofs,
and for making many criticisms and suggestions which have
been of the greatest service, but also for much friendly help and
encouragement without which the preparation of the volume,
often delayed and interrupted by the pressure of other work,
might never have been completed: my debt to Mr Hooker is
indeed greater than can well be expressed in a formal preface.
My thanks are also due to Mr H. D. Vigor for some assistance
in checking the arithmetic, and my acknowledgments to Professor
Edgeworth for the example used in § 5 of Chap. XVII. to illustrate
the influence of the form of the frequency distribution on the
probable error of the median.

I can hardly hope that all errors in the text or in the mass 
of arithmetic involved in examples and exercises have been 
eliminated, and will feel indebted to any reader who directs 
my attention to any such mistakes, or to any omissions, am- 
biguities, or obscurities. 

G. U. Y.

December 1910.



CONTENTS.


INTRODUCTION.

PAGES

1-3. The introduction of the terms “statistics,” “statistical,” into
the English language — 4-6. The change in meaning of these
terms during the nineteenth century — 7-9. The present use
of the terms — 10. Definitions of “statistics,” “statistical
methods,” “theory of statistics,” in accordance with present
usage . . . 1-6


PART I.—THE THEORY OF ATTRIBUTES.

CHAPTER I.

NOTATION AND TERMINOLOGY.

1-2. Statistics of attributes and statistics of variables: fundamental
character of the former — 3-5. Classification by dichotomy —
6-7. Notation for single attributes and for combinations —
8. The class-frequency — 9. Positive and negative attributes,
contraries — 10. The order of a class — 11. The aggregate —
12. The arrangement of classes by order and aggregate —
13-14. Sufficiency of the tabulation of the ultimate class-frequencies —
15-17. Or, better, of the positive class-frequencies —
18. The class-frequencies chosen in the census
for tabulation of statistics of infirmities — 19. Inclusive and
exclusive notations and terminologies . . . 7-16

CHAPTER II.

CONSISTENCE.

1-3. The field of observation or universe, and its specification by
symbols — 4. Derivation of complex from simple relations by
specifying the universe — 5-6. Consistence — 7-10. Conditions
of consistence for one and for two attributes —
11-14. Conditions of consistence for three attributes . . . 17-24




CHAPTER III. 

ASSOCIATION. 

1-4. The criterion of independence — 5-10. The conception of
association, and testing for the same by the comparison
of percentages — 11-12. Numerical equality of the differences
between the four second-order frequencies and their
independence values — 13. Coefficients of association — 14.
Necessity for an investigation into the causation of an
attribute A being extended to include non-A’s . . . 25-41

CHAPTER IV. 

PARTIAL ASSOCIATION. 

1-2. Uncertainty in interpretation of an observed association — 3-5.
Source of the ambiguity: partial associations — 6-8. Illusory
association due to the association of each of two attributes
with a third — 9. Estimation of the partial associations from
the frequencies of the second order — 10-12. The total
number of associations for a given number of attributes —
13-14. The case of complete independence . . . 42-59

CHAPTER V. 

MANIFOLD CLASSIFICATION. 

1. The general principle of a manifold classification — 2-4. The
table of double entry or contingency table and its treatment
by fundamental methods — 5-8. The coefficient of contingency —
9-10. Analysis of a contingency table by tetrads
— 11-13. Isotropic and anisotropic distributions — 14-16.
Homogeneity of the classifications dealt with in the preceding
chapters: heterogeneous classifications . . . 60-74


PART II.—THE THEORY OF VARIABLES.
CHAPTER VI.

THE FREQUENCY-DISTRIBUTION.

1. Introductory — 2. Necessity for classification of observations: the
frequency-distribution — 3. Illustrations — 4. Method of forming
the table — 5. Magnitude of class-intervals — 6. Position
of intervals — 7. Process of classification — 8. Treatment of
intermediate observations — 9. Tabulation — 10. Tables with
unequal intervals — 11. Graphical representation of the
frequency-distribution — 12. Ideal frequency-distributions —
13. The symmetrical distribution — 14. The moderately
asymmetrical distribution — 15. The extremely asymmetrical
or J-shaped distribution — 16. The U-shaped distribution . . . 75-105





CHAPTER VII. 

AVERAGES. 

1. Necessity for quantitative definition of the characters of a
frequency-distribution — 2. Measures of position (averages)
and of dispersion — 3. The dimensions of an average the
same as those of the variable — 4. Desirable properties for
an average to possess — 5. The commoner forms of average —
6-13. The arithmetic mean: its definition, calculation, and
simpler properties — 14-18. The median: its definition,
calculation, and simpler properties — 19-20. The mode: its
definition and relation to mean and median — 21. Summary
comparison of the preceding forms of average — 22-26. The
geometric mean: its definition, simpler properties, and the
cases in which it is specially applicable — 27. The harmonic
mean: its definition and calculation . . . 106-132


CHAPTER VIII. 

MEASURES OF DISPERSION, ETC. 

1. Inadequacy of the range as a measure of dispersion —
2-13. The standard deviation: its definition, calculation,
and properties — 14-19. The mean deviation: its definition,
calculation, and properties — 20-24. The quartile deviation
or semi-interquartile range — 25. Measures of relative dispersion —
26. Measures of asymmetry or skewness — 27-30.
The method of grades or percentiles . . . 133-156


CHAPTER IX. 

CORRELATION. 

1-3. The correlation table and its formation — 4-5. The correlation
surface — 6-7. The general problem — 8-9. The line of means
of rows and the line of means of columns: their relative
positions in the case of independence and of varying degrees
of correlation — 10-14. The correlation-coefficient, the
regressions, and the standard deviations of arrays — 15-16.
Numerical calculations — 17. Certain points to be remembered
in calculating and using the coefficient . . . 157-190

CHAPTER X. 

CORRELATION: ILLUSTRATIONS AND PRACTICAL 
METHODS. 

1. Necessity for careful choice of variables before proceeding to
calculate r — 2-8. Illustration i.: Causation of pauperism —
9-10. Illustration ii.: Inheritance of fertility — 11-13.
Illustration iii.: The weather and the crops — 14. Correlation
between the movements of two variables: (a)
Non-periodic movements: Illustration iv.: changes in
infantile and general mortality — 15-17. (b) Quasi-periodic
movements: Illustration v.: the marriage-rate and foreign
trade — 18. Elementary methods of dealing with cases of
non-linear regression — 19. Certain rough methods of approximating
to the correlation-coefficient — 20-22. The correlation-ratio . . . 191-209


CHAPTER XI. 

MISCELLANEOUS THEOREMS INVOLVING THE USE OF 
THE CORRELATION-COEFFICIENT.

1. Introductory — 2. Standard-deviation of a sum or difference —
3-5. Influence of errors of observation and of grouping on the
standard-deviation — 6-7. Influence of errors of observation
on the correlation-coefficient (Spearman’s theorems) — 8.
Mean and standard-deviation of an index — 9. Correlation
between indices — 10. Correlation-coefficient for a two × two-fold
table — 11. Correlation-coefficient for all possible pairs of
N values of a variable — 12. Correlation due to heterogeneity
of material — 13. Reduction of correlation due to mingling
of uncorrelated with correlated material — 14-17. The
weighted mean — 18-19. Application of weighting to the
correction of death-rates, etc., for varying sex and age-distributions —
20. The weighting of forms of average other
than the arithmetic mean . . . 210-228


CHAPTER XII. 

PARTIAL CORRELATION.

1-2. Introductory explanation — 3. Direct deduction of the formulæ
for two variables — 4. Special notation for the general
case: generalised regressions — 5. Generalised correlations —
6. Generalised deviations and standard-deviations —
7-8. Theorems concerning the generalised product-sums —
9. Direct interpretation of the generalised regressions —
10-11. Reduction of the generalised standard-deviation —
12. Reduction of the generalised regression — 13. Reduction
of the generalised correlation-coefficient — 14. Arithmetical
work: Example i.; Example ii. — 15. Geometrical representation
of correlation between three variables by means of
a model — 16. The coefficient of n-fold correlation — 17. Expression
of regressions and correlations of lower in terms of
those of higher order — 18. Limiting inequalities between
the values of correlation-coefficients necessary for consistence —
19. Fallacies . . . 229-253





PART III.—THEORY OF SAMPLING.


CHAPTER XIII. 

SIMPLE SAMPLING OF ATTRIBUTES.

1. The problem of the present Part — 2. The two chief divisions of
the theory of sampling — 3. Limitation of the discussion to
the case of simple sampling — 4. Definition of the chance of
success or failure of a given event — 5. Determination of the
mean and standard-deviation of the number of successes in
n events — 6. The same for the proportion of successes in n
events: the standard-deviation of simple sampling as a
measure of unreliability or its reciprocal as a measure of
precision — 7. Verification of the theoretical results by
experiment — 8. More detailed discussion of the assumptions
on which the formula for the standard-deviation of simple
sampling is based — 9-10. Biological cases to which the
theory is directly applicable — 11. Standard-deviation of
simple sampling when the numbers of observations in the
samples vary — 12. Approximate value of the standard-deviation
of simple sampling, and relation between mean
and standard-deviation, when the chance of success or
failure is very small — 13. Use of the standard-deviation of
simple sampling, or standard error, for checking and controlling
the interpretation of statistical results . . . 254-275


CHAPTER XIV. 

SIMPLE SAMPLING CONTINUED: EFFECT OF REMOVING
THE LIMITATIONS OF SIMPLE SAMPLING.

1. Warning as to the assumption that three times the standard
error gives the range for the majority of fluctuations of
simple sampling of either sign — 2. Warning as to the use
of the observed for the true value of p in the formula for
the standard error — 3. The inverse standard error, or
standard error of the true proportion for a given observed
proportion: equivalence of the direct and inverse standard
errors when n is large — 4-8. The importance of errors
other than fluctuations of “simple sampling” in practice:
unrepresentative or biassed samples — 9-10. Effect of divergences
from the conditions of simple sampling: (a) effect
of variation in p and q for the several universes from which
the samples are drawn — 11-12. (b) Effect of variation in
p and q from one sub-class to another within each universe —
13-14. (c) Effect of a correlation between the results of the
several events — 15. Summary . . . 276-290





CHAPTER XV. 

THE BINOMIAL DISTRIBUTION AND THE 
NORMAL CURVE 


1-2. Determination of the frequency-distribution for the number
of successes in n events: the binomial distribution — 3.
Dependence of the form of the distribution on p, q, and n —
4-5. Graphical and mechanical methods of forming
representations of the binomial distribution — 6. Direct
calculation of the mean and the standard-deviation from
the distribution — 7-8. Necessity of deducing, for use in
many practical cases, a continuous curve giving approximately,
for large values of n, the terms of the binomial
series — 9. Deduction of the normal curve as a limit to the
symmetrical binomial — 10-11. The value of the central
ordinate — 12. Comparison with a binomial distribution for
a moderate value of n — 13. Outline of the more general
conditions from which the curve can be deduced by advanced
methods — 14. Fitting the curve to an actual series of
observations — 15. Difficulty of a complete test of fit by
elementary methods — 16. The table of areas of the normal
curve and its use — 17. The quartile deviation and the
“probable error” — 18. Illustrations of the application of
the normal curve and of the table of areas . . . 291-316


CHAPTER XVI.

NORMAL CORRELATION.

1-3. Deduction of the general expression for the normal correlation
surface from the case of independence — 4. Constancy of the
standard-deviations of parallel arrays and linearity of the
regression — 5. The contour lines: a series of concentric and
similar ellipses — 6. The normal surface for two correlated
variables regarded as a normal surface for uncorrelated variables
rotated with respect to the axes of measurement:
arrays taken at any angle across the surface are normal
distributions with constant standard-deviation: distributions
of and correlation between linear functions of two normally
correlated variables are normal: principal axes — 7. Standard-deviations
round the principal axes — 8-11. Investigation of
Table III., Chapter IX., to test normality: linearity of
regression, constancy of standard-deviation of arrays,
normality of distribution obtained by diagonal addition,
contour lines — 12-13. Isotropy of the normal distribution
for two variables — 14. Outline of the principal properties of
the normal distribution for n variables . . . 317-334





CHAPTER XVII.

THE SIMPLER CASES OF SAMPLING FOR VARIABLES:
PERCENTILES AND MEAN.

1-2. The problem of sampling for variables: the conditions
assumed — 3. Standard error of a percentile — 4. Special
values for the percentiles of a normal distribution — 5.
Effect of the form of the distribution generally — 6. Simplified
formula for the case of a grouped frequency-distribution — 7.
Correlation between errors in two percentiles of the same
distribution — 8. Standard error of the interquartile range
for the normal curve — 9. Effect of removing the restrictions
of simple sampling, and limitations of interpretation — 10.
Standard error of the arithmetic mean — 11. Relative stability
of mean and median in sampling — 12. Standard error
of the difference between two means — 13. The tendency to
normality of a distribution of means — 14. Effect of removing
the restrictions of simple sampling — 15. Statement of the
standard errors of standard-deviation, coefficient of variation,
correlation-coefficient, and regression — 16. Restatement of
the limitations of interpretation if the sample be small . . . 335-356

Appendix I.—Tables for facilitating Statistical Work . . . 357-359

Appendix II.—Short List of Works on the Mathematical Theory
of Statistics, and the Theory of Probability . . . 360-361

Supplements—

I. Notes Supplementary to Chap. VI. . . . 362-364

II. Direct Deduction of the Formulæ for Regressions . . . 365-366

III. The Law of Small Chances . . . 366-370

IV. Goodness of Fit . . . 370-389

Additional References . . . 390-407

Answers to, and Hints on the Solution of, the Exercises
given . . . 409-416

Index . . . 417-434




THEORY OF STATISTICS.


INTRODUCTION.

1-3. The introduction of the terms “statistics,” “statistical,” into the English
language — 4-6. The change in meaning of these terms during the
nineteenth century — 7-9. The present use of the terms — 10. Definitions
of “statistics,” “statistical methods,” “theory of statistics,” in
accordance with present usage.

1. The words “statist,” “statistics,” “statistical,” appear to be
all derived, more or less indirectly, from the Latin status, in the
sense that it acquired in mediæval Latin of a political state.

2. The first term is, however, of much earlier date than the two
others. The word “statist” is found, for instance, in Hamlet
(1602),¹ Cymbeline (1610 or 1611),² and in Paradise Regained
(1671).³ The earliest occurrence of the word “statistics” yet
noted is in The Elements of Universal Erudition, by Baron J. F.
von Bielfeld, translated by W. Hooper, M.D. (3 vols., London, 1770).
One of its chapters is entitled Statistics, and contains a definition
of the subject as “The science that teaches us what is the political
arrangement of all the modern states of the known world.”⁴
“Statistics” occurs again with a rather wider definition in the
preface to A Political Survey of the Present State of Europe, by
E. A. W. Zimmermann,⁵ issued in 1787. “It is about forty
years ago,” says Zimmermann, “that that branch of political
knowledge, which has for its object the actual and relative
power of the several modern states, the power arising from their
natural advantages, the industry and civilisation of their inhabitants,
and the wisdom of their governments, has been formed, chiefly
by German writers, into a separate science. . . . By the more convenient
form it has now received . . . . this science, distinguished
by the new-coined name of statistics, is become a favourite study
in Germany” (p. ii); and the adjective is also given (p. v): “To
the several articles contained in this work, some respectable

¹ Act v., sc. 2. ² Act ii., sc. 4. ³ …

⁴ I cite from Dr W. F. Willcox, Quarterly Publications of the American
Statistical Association, vol. xiv., 1914, p. 287.
⁵ Zimmermann’s work appears to have been written in English, though he
was a German, Professor of Natural Philosophy at Brunswick.




statistical writers have added a view of the principal epochas of the 
history of each country.” 

3. Within the next few years the words were adopted by several
writers, notably by Sir John Sinclair, the editor and organiser of the
first Statistical Account of Scotland,¹ to whom, indeed, their introduction
has been frequently ascribed. In the circular letter to the
Clergy of the Church of Scotland issued in May 1790,² he states
that in Germany “‘Statistical Inquiries,’ as they are called, have
been carried to a very great extent,” and adds an explanatory
footnote to the phrase “Statistical Inquiries”: “or inquiries
respecting the population, the political circumstances, the productions
of a country, and other matters of state.” In the
“History of the Origin and Progress”³ of the work, he tells us,
“Many people were at first surprised at my using the new words,
Statistics and Statistical, as it was supposed that some term in our
own language might have expressed the same meaning. But in
the course of a very extensive tour, through the northern parts of
Europe, which I happened to take in 1786, I found that in
Germany they were engaged in a species of political enquiry,
to which they had given the name of Statistics. . . . as I
thought that a new word might attract more public attention,
I resolved on adopting it, and I hope that it is now completely
naturalised and incorporated with our language.” This hope
was certainly justified, but the meaning of the word underwent
rapid development during the half century or so following its
introduction.

4. “Statistics” (statistik), as the term is used by German
writers of the eighteenth century, by Zimmermann and by Sir
John Sinclair, meant simply the exposition of the noteworthy
characteristics of a state, the mode of exposition being — almost
inevitably at that time — preponderantly verbal. The conciseness
and definite character of numerical data were recognised at a
comparatively early period — more particularly by English writers
— but trustworthy figures were scarce. After the commencement
of the nineteenth century, however, the growth of official data
was continuous, and numerical statements, accordingly, began
more and more to displace the verbal descriptions of earlier days.
“Statistics” thus insensibly acquired a narrower signification, viz.,

¹ Twenty-one vols., 1791-99.

² Statistical Account, vol. xx., Appendix to “The History of the Origin and
Progress . . .” given at the end of the volume.

³ Loc. cit., p. xiii.

⁴ The Abriss der Staatswissenschaft der Europäischen Reiche (1749) of Gottfried
Achenwall, Professor of Politics at Göttingen, is the volume in which the word
“statistik” appears to be first employed, but the adjective “statisticus”
occurs at a somewhat earlier date in works written in Latin.





the exposition of the characteristics of a State by numerical
methods. It is difficult to say at what epoch the word came
definitely to bear this quantitative meaning, but the transition
appears to have been only half accomplished even after the foundation
of the Royal Statistical Society in 1834. The articles in the
first volume of the Journal, issued in 1838-9, are for the most
part of a numerical character, but the official definition has no
reference to method. “Statistics,” we read, “may be said, in the
words of the prospectus of this Society, to be the ascertaining
and bringing together of those facts which are calculated to
illustrate the condition and prospects of society.”¹ It is, however,
admitted that “the statist commonly prefers to employ figures
and tabular exhibitions.”

5. Once, however, the first change of meaning was accomplished,
further changes followed. From the name of a science or art of
state-description by numerical methods, the word was transferred to
those series of figures with which it operated, as we speak of vital
statistics, poor-law statistics, and so forth. But similar data
occur in many connections; in meteorology, for instance, in anthropology,
etc. Such collections of numerical data were also termed
“statistics,” and consequently, at the present day, the word is
held to cover a collection of numerical data, analogous to those
which were originally formed for the study of the state, on almost
any subject whatever. We not only read of rainfall “statistics,”
but of “statistics” showing the growth of an organisation for
recording rainfall.² We find a chapter headed “Statistics” in a
book on psychology,³ and the author writing of “statistics concerning
the mental characteristics of man,” “statistics of children,
under the headings bright — average — dull.”⁴ We are informed
that, in a book on Latin verse, the characteristics of the Virgilian
hexameter “are examined carefully with statistics.”⁵

6. The development in meaning of the adjective “statistical”
was naturally similar. The methods applied to the study of
numerical data concerning the state were still termed “statistical
methods,” even when applied to data from other sources. Thus
we read of the inheritance of genius being treated “in a statistical
manner,”⁶ and we have now “a journal for the statistical
study of biological problems.”⁷ Such phrases as “the statistical

¹ Jour. Stat. Soc., vol. i., p. 1.

² Symons’ British Rainfall for 1899, p. 15.

³ E. W. Scripture, The New Psychology, 1897, chap. ii.

⁴ Op. cit., p. 18.

⁵ Athenæum, Oct. 3, 1903.

⁶ Francis Galton, Hereditary Genius (Macmillan, 1869), preface.

⁷ Biometrika, Cambridge Univ. Press, the first number issued in 1901.





investigation of the motion of molecules”¹ have become part of
the ordinary language of physicists. We find a work entitled
“the principles of statistical mechanics,”² and the Bakerian
lecture for 1909, by Sir J. Larmor, was on “the statistical and
thermodynamical relations of radiant energy.”

7. It is unnecessary to multiply such instances to show that the
words “statistics,” “statistical,” no longer bear any necessary
reference to “matters of state.” They are applied indifferently in
physics, biology, anthropology, and meteorology, as well as in the
social sciences. Diverse though these cases are, there must be
some community of character between them, or the same terms
and the same methods would not be applied. What, then, is this
common character?

8. Let us turn to social science, as the parent of the methods
termed “statistical,” for a moment, and consider its characteristics
as compared, say, with physics or chemistry. One characteristic
stands out so markedly that attention has been repeatedly
directed to it by “statistical” writers as the source of the peculiar
difficulties of their science: the observer of social facts cannot
experiment, but must deal with circumstances as they occur, apart
from his control. Now the object of experiment is to replace the
complex systems of causation usually occurring in nature by
simple systems in which only one causal circumstance is permitted
to vary at a time. This simplification being impossible, the
observer has, in general, to deal with highly complicated cases of
multiple causation: cases in which a given result may be due to
any one of a number of alternative causes or to a number of
different causes acting conjointly.

9. A little consideration will show, however, that this is also
precisely the characteristic of the observations in other fields to
which statistical methods are applied. The meteorologist, for
example, is in almost precisely the same position as the student
of social science. He can experiment on minor points, but the
records of the barometer, thermometer, and rain gauge have to be
treated as they stand. With the biologist, matters are in somewhat
better case. He can and does apply experimental methods
to a very large extent, but frequently cannot approximate
closely to the experimental ideal; the internal circumstances of
animals and plants too easily evade complete control. Hence a
large field (notably the study of variation and heredity) is left,
in which statistical methods have either to aid or to replace the
methods of experiment. The physicist and chemist, finally,

¹ Clerk Maxwell, “Theory of Heat” (1871), and “On Boltzmann’s
Theorem” (1878), Phil. Trans., vol. xii.

² By J. Willard Gibbs (Macmillan, 1902).





stand at the other extremity of the scale. Theirs are the
sciences in which experiment has been brought to its greatest
perfection. But even so, statistical methods still find application.
In the first place, the methods available for eliminating the effect
of disturbing circumstances, though continually improved, are not,
and cannot be, absolutely perfect. The observer himself, as well
as the observing instrument, is a source of error; the effects of
changes of temperature, or of moisture, of pressure, draughts, vibration,
cannot be completely eliminated. Further, in the problems
of molecular physics, referred to in the last sentences of § 6,
multiplicity of causes is of the essence of the case. The motion
of an atom or of a molecule in the middle of a swarm is dependent
on that of every other atom or molecule in the swarm.

10. In the light of this discussion, we may accordingly give the
following definitions:

By statistics we mean quantitative data affected to a marked
extent by a multiplicity of causes.

By statistical methods we mean methods specially adapted to
the elucidation of quantitative data affected by a multiplicity of
causes.

By theory of statistics we mean the exposition of statistical
methods.

The insertion in the first definition of some such words as “to
a marked extent” is necessary, since the term “statistics” is not
usually applied to data, like those of the physicist, which are
affected only by a relatively small residuum of disturbing causes.
At the same time, statistical methods are applicable to all such
cases, whether the influence of many causes be large or not.


REFERENCES.

The History of the Words “Statistics,” “Statistical.”

(1) John, V., Der Name Statistik; Weiss, Berne, 1883. A translation in
    Jour. Roy. Stat. Soc. for same year.

(2) Yule, G. U., “The Introduction of the Words ‘Statistics,’ ‘Statistical,’
    into the English Language,” Jour. Roy. Stat. Soc., vol. lxviii., 1905,
    p. 391.


The History of Statistics in General.

(3) John, V., Geschichte der Statistik, 1te Teil, bis auf Quetelet; Enke,
    Stuttgart, 1884. (All published; the author died in 1900. By far the
    best history of statistics down to the early years of the nineteenth
    century.)

(4) Mohl, Robert von, Geschichte und Litteratur der Staatswissenschaften,
    3 vols.; Enke, Erlangen, 1855-58. (For history of statistics see
    principally latter half of vol. iii.)





(5) Gabaglio, Antonio, Teoria generale della statistica, 2 vols.; Hoepli,
    Milano, 2nd edn., 1888. (Vol. i., Parte storica.)

Several works on theory of statistics include short histories, e.g.
H. Westergaard’s Die Grundzüge der Theorie der Statistik (Fischer,
Jena, 1890), and P. A. Meitzen’s Geschichte, Theorie und Technik der
Statistik (new edn., 1903; American translation by R. P. Falkner,
1891). There is no detailed history in English, but the article
“Statistics” in the Encyclopædia Britannica (11th edn.) gives a very
slight sketch, and the biographical articles in Palgrave’s Dictionary of
Political Economy are useful. For its importance as regards the English
school of political arithmetic, reference may also be made to:

(6) Hull, C. H., The Economic Writings of Sir William Petty, together
    with the Observations on the Bills of Mortality more probably by Captain
    John Graunt; Cambridge University Press, 2 vols., 1899.


History of Theory of Statistics.

Somewhat slight information is given in the general works cited.
From the purely mathematical side the following is important:

(7) Todhunter, I., A History of the Mathematical Theory of Probability
    from the time of Pascal to that of Laplace; Macmillan, 1865.


History of Official Statistics.

(8) Bertillon, J., Cours élémentaire de statistique; Société d’éditions
    scientifiques, 1895. (Gives an exceedingly useful outline of the history
    of official statistics in different countries.)



PART I. THE THEORY OF ATTRIBUTES.


CHAPTER I.

NOTATION AND TERMINOLOGY.

1-2. Statistics of attributes and statistics of variables: fundamental character
of the former. 3-5. Classification by dichotomy. 6-7. Notation for
single attributes and for combinations. 8. The class-frequency. 9.
Positive and negative attributes, contraries. 10. The order of a class.
11. The aggregate. 12. The arrangement of classes by order and
aggregate. 13-14. Sufficiency of the tabulation of the ultimate
class-frequencies. 15-17. Or, better, of the positive class-frequencies. 18.
The class-frequencies chosen in the census for tabulation of statistics
of infirmities. 19. Inclusive and exclusive notations and terminologies.

1. The methods of statistics, as defined in the Introduction,
deal with quantitative data alone. The quantitative character
may, however, arise in two different ways.

In the first place, the observer may note only the presence or
absence of some attribute in a series of objects or individuals, and
count how many do or do not possess it. Thus, in a given
population, we may count the number of the blind and seeing,
the dumb and speaking, or the insane and sane. The quantitative
character, in such cases, arises solely in the counting.

In the second place, the observer may note or measure the
actual magnitude of some variable character for each of the
objects or individuals observed. He may record, for instance, the
ages of persons at death, the prices of different samples of a
commodity, the statures of men, the numbers of petals in flowers.
The observations in these cases are quantitative ab initio.

2. The methods applicable to the former kind of observations,
which may be termed statistics of attributes, are also applicable
to the latter, or statistics of variables. A record of statures of
men, for example, may be treated by simply counting all measurements
as tall that exceed a certain limit, neglecting the magnitude
of excess or defect, and stating the numbers of tall and short (or



more strictly not-tall) on the basis of this classification. Similarly,
the methods that are specially adapted to the treatment of
statistics of variables, making use of each value recorded, are
available to a greater extent than might at first sight seem possible
for dealing with statistics of attributes. For example, we may
treat the presence or absence of the attribute as corresponding to
the changes of a variable which can only possess two values, say
0 and 1. Or, we may assume that we have really to do with a
variable character which has been crudely classified, as suggested
above, and we may be able, by auxiliary hypotheses as to the
nature of this variable, to draw further conclusions. But the
methods and principles developed for the case in which the observer
only notes the presence or absence of attributes are the simplest
and most fundamental, and are best considered first. This and
the next three chapters (Chapters I.-IV.) are accordingly devoted
to the Theory of Attributes.
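
The reduction of a record of variables to statistics of attributes may be sketched in a few lines of Python. The statures and the limit of 68 inches used here are hypothetical values chosen only for illustration; they do not come from the text.

```python
# Hypothetical statures in inches (illustrative values, not from the text).
statures = [64.5, 70.2, 66.0, 71.8, 68.0, 62.3, 69.1]

# Assumed limit: a stature counts as "tall" only if it exceeds this value.
LIMIT = 68.0

# Treat the attribute as a variable that can take only the values 0 and 1.
is_tall = [1 if s > LIMIT else 0 for s in statures]

tall = sum(is_tall)               # number of tall
not_tall = len(statures) - tall   # number of not-tall

print(tall, not_tall)
```

The magnitude of excess or defect is discarded; only the counts of the two classes remain.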

3. The objects or individuals that possess the attribute, and
those that do not possess it, may be said to be members of two
distinct classes, the observer classifying the objects or individuals
observed. In the simplest case, where attention is paid to one
attribute alone, only two mutually exclusive classes are formed.
If several attributes are noted, the process of classification may,
however, be continued indefinitely. Those that do and do not
possess the first attribute may be reclassified according as they do
or do not possess the second, the members of each of the
sub-classes so formed according as they do or do not possess the
third, and so on, every class being divided into two at each step.
Thus the members of the population of any district may be
classified into males and females; the members of each sex into
sane and insane; the insane males, sane males, insane females,
and sane females into blind and seeing. If we were dealing with
a number of peas (Pisum sativum) of different varieties, they
might be classified as tall or dwarf, with green seeds or yellow
seeds, with wrinkled seeds or round seeds, so that we would have
eight classes: tall with round green seeds, tall with round yellow
seeds, tall with wrinkled green seeds, tall with wrinkled yellow
seeds, and four similar classes of dwarf plants.
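
The eight classes of the pea illustration can be listed mechanically. The following Python sketch is only an illustration of repeated division in two; the label strings are our own wording.

```python
from itertools import product

# Each attribute is noted as present or absent, so every existing class
# is cut in two each time a further attribute is taken into account.
height = ("tall", "dwarf")
colour = ("green seeds", "yellow seeds")
shape = ("round seeds", "wrinkled seeds")

classes = [", ".join(c) for c in product(height, colour, shape)]

for c in classes:
    print(c)
```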

4. It may be noticed that the fact of classification does not
necessarily imply the existence of either a natural or a clearly
defined boundary between the two classes. The boundary may
be wholly arbitrary, e.g. where prices are classified as above or
below some special value, barometer readings as above or below
some particular height. The division may also be vague and
uncertain: sanity and insanity, sight and blindness, pass
into each other by such fine gradations that judgments may




differ as to the class in which a given individual should be
entered. The possibility of uncertainties of this kind should
always be borne in mind in considering statistics of attributes:
whatever the nature of the classification, however, natural or
artificial, definite or uncertain, the final judgment must be
decisive; any one object or individual must be held either to possess
the given attribute or not.

5. A classification of the simple kind considered, in which each
class is divided into two sub-classes and no more, has been termed
by logicians classification, or, to use the more strictly applicable
term, division by dichotomy (cutting in two). The classifications
of most statistics are not dichotomous, for most usually a
class is divided into more than two sub-classes, but dichotomy is
the fundamental case. In Chapter V. the relation of dichotomy
to more elaborate (manifold, instead of twofold or dichotomous)
processes of classification, and the methods applicable to some
such cases, are dealt with briefly.

6. For theoretical purposes it is necessary to have some simple
notation for the classes formed, and for the numbers of observations
assigned to each.

The capitals A, B, C, ... will be used to denote the several
attributes. An object or individual possessing the attribute A
will be termed simply A. The class, all the members of which
possess the attribute A, will be termed the class A. It is
convenient to use single symbols also to denote the absence of the
attributes A, B, C, ... We shall employ the Greek letters α,
β, γ, ... Thus if A represents the attribute blindness, α
represents sight, i.e. non-blindness; if B stands for deafness, β
stands for hearing. Generally “α” is equivalent to “non-A,” or
an object or individual not possessing the attribute A; the class α
is equivalent to the class none of the members of which possess the
attribute A.

7. Combinations of attributes will be represented by
juxtapositions of letters. Thus if, as above, A represents blindness,
B deafness, AB represents the combination blindness and deafness.
If the presence and absence of these attributes be noted, the four
classes so formed, viz. AB, Aβ, αB, αβ, include respectively the
blind and deaf, the blind but not-deaf, the deaf but not-blind, and
the neither blind nor deaf. If a third attribute be noted, e.g.
insanity, denoted say by C, the class ABC includes those who are
at once deaf, blind, and insane, ABγ those who are deaf and blind
but not insane, and so on.

Any letter or combination of letters like A, AB, αB, ABγ, by
means of which we specify the characters of the members of a class,
may be termed a class symbol.





8. The number of observations assigned to any class is termed,
for brevity, the frequency of the class, or the class-frequency.
Class-frequencies will be denoted by enclosing the corresponding
class-symbols in brackets. Thus:

(A)    denotes number of A’s,   i.e. objects possessing attribute A
(α)    denotes number of α’s,   i.e. objects not possessing attribute A
(AB)   denotes number of AB’s,  i.e. objects possessing attributes A and B
(αB)   denotes number of αB’s,  i.e. objects possessing attribute B but not A
(ABC)  denotes number of ABC’s, i.e. objects possessing attributes A, B, and C
(αBC)  denotes number of αBC’s, i.e. objects possessing attributes B and C but not A
(αβC)  denotes number of αβC’s, i.e. objects possessing attribute C but neither A nor B

and so on for any number of attributes. If A represent, as in
the illustration above, blindness, B deafness, C insanity, the
symbols given stand for the numbers of the blind, the not-blind,
the blind and deaf, the deaf but not blind, the blind, deaf, and
insane, the deaf and insane but not blind, and the insane but neither
blind nor deaf, respectively.
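
A class-frequency in this notation is simply a count of the observations satisfying the class-symbol. The Python sketch below illustrates this under two assumptions of our own: each observation is recorded as the set of attributes possessed, and lower-case letters stand in for the Greek letters (a for α, b for β, c for γ). The six records are hypothetical.

```python
def class_frequency(symbol, observations):
    """Count the observations that belong to the class named by `symbol`.

    Assumed convention for this sketch: a capital letter requires the
    attribute to be present; the matching lower-case letter stands in
    for the Greek letter and requires it to be absent. Attributes not
    named in the symbol are ignored.
    """
    count = 0
    for possessed in observations:
        ok = all(
            (letter in possessed) if letter.isupper()
            else (letter.upper() not in possessed)
            for letter in symbol
        )
        count += ok
    return count

# Hypothetical records: the set of attributes each individual possesses
# (A blindness, B deafness, C insanity).
observations = [{"A"}, {"A", "B"}, {"B"}, set(), {"A", "B", "C"}, {"C"}]

print(class_frequency("A", observations))    # (A)
print(class_frequency("aB", observations))   # (αB)
print(class_frequency("abC", observations))  # (αβC)
```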

9. The attributes denoted by capitals A, B, C, ... may be
termed positive attributes, and their contraries denoted by Greek
letters negative attributes. If a class-symbol include only
capital letters, the class may be termed a positive class; if only
Greek letters, a negative class. Thus the classes A, AB, ABC
are positive classes; the classes α, αβ, αβγ, negative classes.

If two classes are such that every attribute in the symbol for
the one is the negative or contrary of the corresponding attribute
in the symbol for the other, they may be termed contrary classes
and their frequencies contrary frequencies; e.g. AB and αβ, Aβ
and αB, AβC and αBγ, are pairs of contraries.

10. The classes obtained by noting say n attributes fall into
natural groups according to the numbers of attributes used to
specify the respective classes, and these natural groups should be
borne in mind in tabulating the class-frequencies. A class
specified by r attributes may be spoken of as a class of the rth
order and its frequency as a frequency of the rth order. Thus AB,
AC, BC are classes of the second order; (A), (Aβ), (αBC),
(ABγD), class-frequencies of the first, second, third, and fourth
orders respectively.

11. The classes of one and the same order fall into further
groups according to the actual attributes specified. Thus if three
attributes A, B, C have been noted, the classes of the second order
may be specified by any one of the pairs of attributes AB, AC, or
BC (and their contraries). The series of classes or class-frequencies
given by any one positive class and the classes whose symbols
are derived therefrom by substituting Greek letters for one or
more of the italic capital letters in every possible way will be
termed an aggregate. Thus (AB) (Aβ) (αB) (αβ) form an aggregate
of frequencies of the second order, and the twelve classes of
the second order which can be formed where three attributes
have been noted may be grouped into three such aggregates.

12. Class-frequencies should, in tabulating, be arranged so that
frequencies of the same order and frequencies belonging to the
same aggregate are kept together. Thus the frequencies for the
case of three attributes should be grouped as given below; the
whole number of observations, denoted by the letter N, being
reckoned as a frequency of order zero, since no attributes are
specified:

Order 0.    N

Order 1.    (A)     (B)     (C)
            (α)     (β)     (γ)

Order 2.    (AB)    (AC)    (BC)
            (Aβ)    (Aγ)    (Bγ)
            (αB)    (αC)    (βC)
            (αβ)    (αγ)    (βγ)

Order 3.    (ABC)   (αBC)
            (ABγ)   (αBγ)
            (AβC)   (αβC)
            (Aβγ)   (αβγ)                                  (1)

13. In such a complete table for the case of three attributes,
twenty-seven distinct frequencies are given: 1 of order zero, 6
of the first order, 12 of the second, and 8 of the third. It
is, however, in no case necessary to give such a complete
statement.

The whole number of observations must clearly be equal to the
number of A’s together with the number of α’s, the number of
A’s to the number of A’s that are B together with the number of
A’s that are not B, and so on; i.e. any class-frequency can always
be expressed in terms of class-frequencies of higher order. Thus:

$$
\begin{aligned}
N &= (A) + (\alpha) = (B) + (\beta) = \text{etc.} \\
  &= (AB) + (A\beta) + (\alpha B) + (\alpha\beta) = \text{etc.} \\
(A) &= (AB) + (A\beta) = (AC) + (A\gamma) = \text{etc.} \\
(AB) &= (ABC) + (AB\gamma) = \text{etc.}
\end{aligned}
\tag{2}
$$

Hence, instead of enumerating all the frequencies as under (1),
no more need be given, for the case of three attributes, than
the eight frequencies of the third order. If four attributes had
been noted it would be sufficient to give the sixteen frequencies of
the fourth order.
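
The count of twenty-seven may be checked by enumerating the class-symbols order by order. This Python sketch assumes, as an encoding of our own, that lower-case letters stand in for the Greek letters.

```python
from itertools import combinations
from math import comb

attributes = "ABC"
n = len(attributes)

# A class of order r names r of the n attributes, each either as the
# positive (capital) or as the negative (lower case here, Greek in the text).
classes_by_order = {}
for r in range(n + 1):
    symbols = []
    for chosen in combinations(attributes, r):
        for mask in range(2 ** r):
            symbols.append("".join(
                ch if ((mask >> i) & 1) == 0 else ch.lower()
                for i, ch in enumerate(chosen)
            ))
    classes_by_order[r] = symbols

counts = [len(classes_by_order[r]) for r in range(n + 1)]

# Each order r should contain (combinations of n things r together) * 2**r classes.
expected = [comb(n, r) * 2 ** r for r in range(n + 1)]

print(counts, sum(counts))
```

The empty symbol of order zero corresponds to N, the whole number of observations.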

The classes specified by all the attributes noted in any case,
i.e. classes of the nth order in the case of n attributes, may be
termed the ultimate classes and their frequencies the ultimate
frequencies. Hence we may say that it is never necessary to
enumerate more than the ultimate frequencies. All the others can
be obtained from these by simple addition.

Example i. (See reference 5 at the end of the chapter.)
A number of school children were examined for the presence
or absence of certain defects of which three chief descriptions
were noted, A development defects, B nerve signs, C low
nutrition.

Given the following ultimate frequencies, find the frequencies
of the positive classes, including the whole number of
observations N:

(ABC)      57        (αBC)      78
(ABγ)     281        (αBγ)     670
(AβC)      86        (αβC)      65
(Aβγ)     453        (αβγ)    8310

The whole number of observations N is equal to the grand
total: N = 10,000.

The frequency of any first-order class, e.g. (A), is given by the
total of the four third-order frequencies, the class-symbols for
which contain the same letter:

$$(ABC) + (AB\gamma) + (A\beta C) + (A\beta\gamma) = (A) = 877.$$

Similarly, the frequency of any second-order class, e.g. (AB), is
given by the total of the two third-order frequencies, the
class-symbols for which both contain the same pair of letters:

$$(ABC) + (AB\gamma) = (AB) = 338.$$

The complete results are:

N      10,000        (AB)      338
(A)       877        (AC)      143
(B)     1,086        (BC)      135
(C)       286        (ABC)      57
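
The additions of Example i can be reproduced in a short Python sketch. The dictionary encoding, with lower-case letters standing in for the Greek letters, and the function name are our own assumptions; the figures are those of the example.

```python
# Ultimate class-frequencies of Example i. Lower-case letters stand in
# for the Greek letters (a for α, b for β, c for γ): an assumed encoding.
ultimate = {
    "ABC": 57, "aBC": 78,
    "ABc": 281, "aBc": 670,
    "AbC": 86, "abC": 65,
    "Abc": 453, "abc": 8310,
}

def positive_frequency(letters, ultimate):
    """Sum the ultimate frequencies whose symbols contain every capital in `letters`."""
    return sum(f for symbol, f in ultimate.items()
               if all(ch in symbol for ch in letters))

positive = {name: positive_frequency(name, ultimate)
            for name in ["", "A", "B", "C", "AB", "AC", "BC", "ABC"]}
positive["N"] = positive.pop("")   # order zero: no attribute specified

print(positive)
```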


14. The number of ultimate frequencies in the general case of
n attributes, or the number of classes in an aggregate of the nth
order, is given by considering that each letter of the class-symbol
may be written in two ways (A or α, B or β, C or γ), and that
either way of writing one letter may be combined with either
way of writing another. Hence the whole number of ways in
which the class-symbol may be written, i.e. the number of
classes, is

$$2 \times 2 \times 2 \times 2 \cdots = 2^n.$$





The ultimate frequencies form one natural set in terms of which
the data are completely given, but any other set containing the
same number of algebraically independent frequencies, viz. $2^n$,
may be chosen instead.

15. The positive class-frequencies, including under this head the
total number of observations N, form one such set. They are
algebraically independent: no one positive class-frequency can be
expressed wholly in terms of the others. Their number is, moreover,
$2^n$, as may be readily seen from the fact that if the Greek letters
are struck out of the symbols for the ultimate classes, they become
the symbols for the positive classes, with the exception of
αβγ ..., for which N must be substituted. Otherwise the number
is made up as follows:

Order 0. (The whole number of observations)                    1
Order 1. (The number of attributes noted)                      n
Order 2. (The number of combinations of n things 2 together)   n(n - 1)/(1·2)
Order 3. (The number of combinations of n things 3 together)   n(n - 1)(n - 2)/(1·2·3)
and so on.

But the series

$$1 + n + \frac{n(n-1)}{1 \cdot 2} + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3} + \cdots$$

is the binomial expansion of $(1+1)^n$ or $2^n$, therefore the total
number of positive classes is $2^n$.
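
The agreement between the order-by-order count and the binomial total can be verified numerically. The Python sketch below is only a check for small n, with a function name of our own choosing; it is not a proof.

```python
from math import comb

def positive_class_count(n):
    """Number of positive classes for n attributes, counted order by order."""
    return sum(comb(n, r) for r in range(n + 1))

for n in range(1, 7):
    # Each total should agree with 2**n, the number of ultimate classes.
    print(n, positive_class_count(n), 2 ** n)
```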

16. The set of positive class-frequencies is a most convenient
one for both theoretical and practical purposes.

Compare, for instance, the two forms of statement, in terms of
the ultimate and the positive classes respectively, as given in
Example i., § 13. The latter gives directly the whole number of
observations and the totals of A’s, B’s, and C’s. The former gives
none of these figures without the performance of more or less
lengthy additions. Further, the latter gives the second-order
frequencies, which are only indirectly given by the frequencies
of the ultimate classes.

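17. The expression of any class-frequency in terms of the
positive frequencies is most easily obtained by a process of
step-by-step substitution; thus:

$$
\begin{aligned}
(\alpha\beta) &= (\alpha) - (\alpha B) = N - (A) - (B) + (AB) \\
(\alpha\beta\gamma) &= (\alpha\beta) - (\alpha\beta C) \\
&= N - (A) - (B) - (C) + (AB) + (AC) + (BC) - (ABC)
\end{aligned}
$$

Arithmetical work, however, should be executed from first
principles, and not by quoting formulae like the above.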

Example ii. Check the work of Example i., § 13, by finding the
frequencies of the ultimate classes from the frequencies of the
positive classes.

$$
\begin{aligned}
(AB\gamma) &= (AB) - (ABC) = 338 - 57 = 281 \\
(A\beta\gamma) &= (A\gamma) - (AB\gamma) = (A) - (AC) - (AB\gamma) \\
&= 877 - 143 - 281 = 453 \\
(\alpha\beta\gamma) &= N - (B) - (C) + (BC) - (A\beta\gamma) \\
&= 10{,}000 - 1086 - 286 + 135 - 453 \\
&= 10{,}135 - 1825 = 8310
\end{aligned}
$$

and so on.
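
For checking such work mechanically, the step-by-step substitution can be written as an alternating sum over the negatived attributes. The following Python sketch is an illustration only, with lower-case letters standing in for the Greek letters (our assumption); hand arithmetic should still proceed from first principles.

```python
from itertools import combinations

# Positive class-frequencies of Example i ("" holds N, the order-zero total).
positive = {"": 10000, "A": 877, "B": 1086, "C": 286,
            "AB": 338, "AC": 143, "BC": 135, "ABC": 57}

def ultimate_frequency(symbol, positive):
    """Expand one ultimate frequency in terms of the positive frequencies.

    Lower-case letters stand in for Greek letters (assumed encoding).
    Each negatived attribute contributes one substitution step, which
    gives an alternating sum over subsets of the negatived letters.
    """
    present = [ch for ch in symbol if ch.isupper()]
    absent = [ch.upper() for ch in symbol if ch.islower()]
    total = 0
    for k in range(len(absent) + 1):
        for extra in combinations(absent, k):
            key = "".join(sorted(present + list(extra)))
            total += (-1) ** k * positive[key]
    return total

for symbol in ["ABC", "aBC", "ABc", "aBc", "AbC", "abC", "Abc", "abc"]:
    print(symbol, ultimate_frequency(symbol, positive))
```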

18. Examples of statistics of precisely the kind now under
consideration are afforded by the census returns, of 1891 or
1901, for England and Wales, of persons suffering from different
“infirmities,” any individual who is deaf and dumb, blind or
mentally deranged (lunatic, imbecile, or idiot) being required to
be returned as such on the schedule. The classes chosen for
tabulation are, however, neither the positive nor the ultimate
classes, but the following (neglecting minor distinctions amongst
the mentally deranged and the returns of persons who are deaf
but not dumb): Dumb, blind, mentally deranged; dumb and
blind but not deranged; dumb and deranged but not blind;
blind and deranged but not dumb; blind, dumb, and deranged.
If, in the symbolic notation, deaf-mutism be denoted by A,
blindness by B, and mental derangement by C, the class-frequencies
thus given are (A), (B), (C), (ABγ), (AβC), (αBC), (ABC) (cf.
Census of England and Wales, 1891, vol. iii., tables 15 and 16,
p. lvii.; Census of 1901, Summary Tables, table xlix.). This set of
frequencies does not appear to possess any special advantages.
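
A statement in the census form can be turned into the positive class-frequencies by simple addition. In the Python sketch below the seven frequencies are not census figures: they are the school-children figures of Example i, rearranged into the census form so that the result can be checked against that example. Lower-case letters stand in for the Greek letters (our assumption).

```python
# Frequencies tabulated in the census manner, here filled with the
# school-children figures of Example i so that the result can be checked.
# Lower-case letters stand in for Greek letters (assumed encoding).
census = {"A": 877, "B": 1086, "C": 286,
          "ABc": 281, "AbC": 86, "aBC": 78, "ABC": 57}

# Each second-order positive frequency is the sum of two ultimate ones,
# e.g. (AB) = (ABγ) + (ABC).
positive = {
    "A": census["A"], "B": census["B"], "C": census["C"],
    "AB": census["ABc"] + census["ABC"],
    "AC": census["AbC"] + census["ABC"],
    "BC": census["aBC"] + census["ABC"],
    "ABC": census["ABC"],
}

print(positive)
```

The whole number of observations N must still be supplied separately, since it is not among the seven frequencies listed.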

19. The symbols of our notation are, it should be remarked,
used in an inclusive sense, the symbol A, for example, signifying
an object or individual possessing the attribute A with or without
others. This seems to be the only natural use of the symbol,
but at least one notation has been constructed on an exclusive
basis (cf. ref. 5), the symbol A denoting that the object or
individual possesses the attribute A, but not B or C or D, or
whatever other attributes have been noted. An exclusive notation is
apt to be relatively cumbrous and also ambiguous, for the reader
cannot know what attributes a given symbol excludes until he
has seen the whole list of attributes of which note has been
taken, and this list he must bear in mind. The statement that
the symbol A is used exclusively cannot mean, obviously, that the
object referred to possesses only the attribute A and no others





whatever; it merely excludes the other attributes noted in the
particular investigation. Adjectives, as well as the symbols which
may represent them, are naturally used in an inclusive sense, and
care should therefore be taken, when classes are verbally described,
that the description is complete, and states what, if anything, is
excluded as well as what is included, in the same way as our
notation. The terminology of the English census has not, in
this respect, been quite clear. “The Blind” includes those who
are “Blind and Dumb,” or “Blind, Dumb, and Lunatic,” and so
forth. But the heading “Blind and Dumb,” in the table relating
to “combined infirmities,” is used in the sense “Blind and Dumb,
but not Lunatic or Imbecile,” etc., and so on for the others. In
the first table the headings are inclusive, in the second exclusive.


REFERENCES.

(1) Jevons, W. Stanley, “On a General System of Numerically Definite
    Reasoning,” Memoirs of the Manchester Lit. and Phil. Soc., 1870.
    Reprinted in Pure Logic and other Minor Works; Macmillan, 1890.
    (The method used in these chapters is that of Jevons, with the notation
    slightly modified to that employed in the next three memoirs cited.)

(2) Yule, G. U., “On the Association of Attributes in Statistics, etc.,”
    Phil. Trans. Roy. Soc., Series A, vol. cxciv., 1900, p. 257.

(3) Yule, G. U., “On the Theory of Consistence of Logical Class-frequencies
    and its Geometrical Representation,” Phil. Trans. Roy. Soc., Series A,
    vol. cxcvii., 1901, p. 91.

(4) Yule, G. U., “Notes on the Theory of Association of Attributes in
    Statistics,” Biometrika, vol. ii., 1903, p. 121. (The first three sections
    of (4) are an abstract of (2) and (3). The remarks made as regards the
    tabulation of class-frequencies at the end of (2) should be read in
    connection with the remarks made at the beginning of (3) and in this
    chapter: cf. footnote on p. 94 of (3).)

Material has been cited from, and reference made to the notation used in:

(5) Warner, F., and others, “Report on the Scientific Study of the Mental and
    Physical Conditions of Childhood”; published by the Committee,
    Parkes Museum, 1895.

(6) Warner, F., “Mental and Physical Conditions among Fifty Thousand
    Children, etc.,” Jour. Roy. Stat. Soc., vol. lix., 1896, p. 125.

EXERCISES.

1. (Figures from ref. (6).) The following are the numbers of boys observed
with certain classes of defects amongst a number of school-children. A
denotes development defects; B, nerve signs; C, low nutrition.

(ABC)      149        (αBC)       204
(ABγ)      738        (αBγ)     1,762
(AβC)      225        (αβC)       171
(Aβγ)    1,196        (αβγ)    21,842

Find the frequencies of the positive classes.







2. (Figures from ref. (5).) The following are the frequencies of the
positive classes for the girls in the same investigation:

N       23,713        (AB)      587
(A)      1,018        (AC)      428
(B)      2,016        (BC)      335
(C)        770        (ABC)     156

Find the frequencies of the ultimate classes.

3. (Figures from Census, England and Wales, 1891, vol. iii.) Convert the
census statement as below into a statement in terms of (a) the positive, (b)
the ultimate class-frequencies. A = blindness, B = deaf-mutism, C = mental
derangement.

N       29,002,525        (ABγ)      82
(A)         23,467        (AβC)     380
(B)         14,192        (αBC)     600
(C)         97,383        (ABC)      25


4. (Cf. Mill’s Logic, bk. iii., ch. xvii., and ref. (1).) Show that if A
occurs in a larger proportion of the cases where B is than where B is not,
then will B occur in a larger proportion of the cases where A is than where
A is not: i.e. given (AB)/(B) > (Aβ)/(β), show that (AB)/(A) > (αB)/(α).

5 {Of. De Morgan, Formal Logic, p. 163, and ref (1) ) Most -^’s are ^’s, 
most ^’s are (Ts hud the least number oi -4’s that are C’s, i.e. the lowest 
possible value of {AO). 

6. Given that

$$(A) = (\alpha) = (B) = (\beta) = \tfrac{1}{2}N,$$

show that

$$(AB) = (\alpha\beta), \qquad (A\beta) = (\alpha B).$$

7. (Cf. ref. (2), § 9, “Case of equality of contraries.”) Given that

$$(A) = (\alpha) = (B) = (\beta) = (C) = (\gamma) = \tfrac{1}{2}N,$$

and also that

$$(ABC) = (\alpha\beta\gamma),$$

show that

$$2(ABC) = (AB) + (AC) + (BC) - \tfrac{1}{2}N.$$


8. Measurements are made on a thousand husbands and a thousand wives.
If the measurements of the husbands exceed the measurements of the wives in
800 cases for one measurement, in 700 cases for another, and in 660 cases for
both measurements, in how many cases will both measurements on the wife
exceed the measurements on the husband?



CHAPTER II. 


CONSISTENCE.

1-3. The field of observation or universe and its specification by symbols.
4. Derivation of complex from simple relations by specifying the
universe. 5-6. Consistence. 7-10. Conditions of consistence for one
and for two attributes. 11-14. Conditions of consistence for three
attributes.

1. Any statistical inquiry is necessarily confined to a certain
time, space, or material. An investigation on the prevalence of
insanity, for instance, may be limited to England, to England in
1901, to English males in 1901, or even to English males over 60
years of age in 1901, and so on.

For actual work on any given subject, no term is required to
denote the material to which the work is so confined: the limits
are specified, and that is sufficient. But for theoretical purposes
some term is almost essential to avoid circumlocution. The
expression the universe of discourse, or simply the universe, used
in this sense by writers on logic, may be adopted as familiar and
convenient.

2. The universe, like any class, may be considered as specified
by an enumeration of the attributes common to all its members,
e.g. to take the illustration of § 1, those implied by the predicates
English, male, over 60 years of age, living in 1901. It is not, in
general, necessary to introduce a special letter into the
class-symbols to denote the attributes common to all members of the
universe. We know that such attributes must exist, and the
common symbol can be understood.

In strictness, however, the symbol ought to be written: if, say,
U denote the combination of attributes, English, male, over 60,
living in 1901, A insanity, B blindness, we should strictly use
the symbols:

(U)   = Number of English males over 60 living in 1901,
(UA)  = Number of insane English males over 60 living in 1901,
(UB)  = Number of blind English males over 60 living in 1901,
(UAB) = Number of blind and insane English males over 60 living in 1901,



instead of the simpler symbols N, (A), (B), (AB). Similarly, the
general relations (2), § 13, Chap. I., using U to denote the common
attributes of all the members of the universe and (U) consequently
the total number of observations N, should in strictness be written
in the form:

$$
\begin{aligned}
(U) &= (UA) + (U\alpha) = (UB) + (U\beta) = \text{etc.} \\
    &= (UAB) + (UA\beta) + (U\alpha B) + (U\alpha\beta) = \text{etc.} \\
(UA) &= (UAB) + (UA\beta) = (UAC) + (UA\gamma) = \text{etc.} \\
(UAB) &= (UABC) + (UAB\gamma) = \text{etc.}
\end{aligned}
$$

3. Clearly, however, we might have used any other symbol
instead of U to denote the attributes common to all the members
of the universe, e.g. A or B or AB or ABC, writing in the latter
case

$$(ABC) = (ABCD) + (ABC\delta)$$

and so on. Hence any attribute or combination of attributes
common to all the class-symbols in an equation may be regarded as
specifying the universe within which the equation holds good.
Thus the equation just written may be read in words: “The
number of objects or individuals in the universe ABC is equal to
the number of D’s together with the number of not-D’s within
the same universe.” The equation

$$(AC) = (ABC) + (A\beta C)$$

may be read: “The number of A’s is equal to the number of A’s
that are B together with the number of A’s that are not B
within the universe C.”

4. The more complex may be derived from the simpler relations
between class-frequencies very readily by the process of specifying
the universe. Thus starting from the simple equation

$$(\alpha) = N - (A),$$

we have, by specifying the universe as β,

$$(\alpha\beta) = (\beta) - (A\beta) = N - (A) - (B) + (AB).$$

Specifying the universe, again, as γ, we have

$$
\begin{aligned}
(\alpha\beta\gamma) &= (\gamma) - (A\gamma) - (B\gamma) + (AB\gamma) \\
&= N - (A) - (B) - (C) + (AB) + (AC) + (BC) - (ABC).
\end{aligned}
$$

5. Any class-frequencies which have been or might have been
observed within one and the same universe may be said to be
consistent with one another. They conform with one another,
and do not in any way conflict.

The conditions of consistence are some of them simple, but
others are by no means of an intuitive character. Suppose, for
instance, the data are given:

N      1000        (AB)      42
(A)     525        (AC)     147
(B)     312        (BC)      86
(C)     470        (ABC)     25

There is nothing obviously wrong with the figures. Yet they
are certainly inconsistent. They might have been observed at
different times, in different places or on different material, but
they cannot have been observed in one and the same universe.
They imply, in fact, a negative value for (αβγ):

$$
\begin{aligned}
(\alpha\beta\gamma) &= 1000 - 525 - 312 - 470 + 42 + 147 + 86 - 25 \\
&= 1000 - 1307 + 275 - 25 \\
&= -57.
\end{aligned}
$$

Clearly no class-frequency can be negative. If the figures,
consequently, are alleged to be the result of an actual inquiry in
a definite universe, there must have been some miscount or
misprint.

6. Generally, then, we may say that any given class-frequencies
are inconsistent if they imply negative values for any of the
unstated frequencies. Otherwise they are consistent. To test the
consistence of any set of $2^n$ algebraically independent frequencies,
for the case of n attributes, we should accordingly calculate
the values of all the unstated frequencies, and so verify the fact
that they are positive. This procedure may, however, be limited
by a simple consideration. If the ultimate class-frequencies are
positive, all others must be so, being derived from the ultimate
frequencies by simple addition. Hence we need only calculate
the values of the ultimate class-frequencies in terms of those
given, and verify the fact that they exceed zero.
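
The test just described may be sketched in Python as follows, applied to the figures of § 5. The function names, and the use of lower-case letters in place of the Greek letters, are our own assumptions; the sketch treats a frequency of zero as admissible and rejects only negative values.

```python
from itertools import combinations, product

def ultimate_frequencies(positive, attributes="ABC"):
    """Return every ultimate frequency implied by the positive frequencies.

    `positive` maps "" (for N), "A", "AB", ... to counts; lower-case
    letters in the returned symbols stand in for Greek letters.
    """
    result = {}
    for signs in product([True, False], repeat=len(attributes)):
        present = [a for a, s in zip(attributes, signs) if s]
        absent = [a for a, s in zip(attributes, signs) if not s]
        total = 0
        for k in range(len(absent) + 1):
            for extra in combinations(absent, k):
                key = "".join(sorted(present + list(extra)))
                total += (-1) ** k * positive[key]
        symbol = "".join(a if s else a.lower() for a, s in zip(attributes, signs))
        result[symbol] = total
    return result

def is_consistent(positive, attributes="ABC"):
    """Consistent if no implied ultimate frequency is negative."""
    return all(f >= 0 for f in ultimate_frequencies(positive, attributes).values())

# The data of § 5.
data = {"": 1000, "A": 525, "B": 312, "C": 470,
        "AB": 42, "AC": 147, "BC": 86, "ABC": 25}

print(ultimate_frequencies(data)["abc"], is_consistent(data))
```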

7. As we saw in the last chapter, there are two sets of $2^n$
algebraically independent frequencies of practical importance, viz.
(1) the ultimate, (2) the positive class-frequencies.

It follows from what we have just said that there is only one
condition of consistence for the ultimate frequencies, viz. that
they must all exceed zero. Apart from this, any one frequency of
the set may vary anywhere between 0 and ∞ without becoming
inconsistent with the others.

For the positive class-frequencies, the conditions may be



20 


THEORY OF STATISTICS, 


expressed symbolically by expanding the ultimate in terms of 
the positive frequencies, and writing each such expansion not 
less than zero We will consider the cases of one, two, and 
three attributes in turn 

8. If only one attribute be noted, say A, the positive frequencies
are N and (A). The ultimate frequencies are (A) and (α), where

(α) = N - (A).

The conditions of consistence are therefore simply

(A) ≥ 0        N - (A) ≥ 0

or, more conveniently expressed,

(a) (A) ≥ 0        (b) (A) ≤ N        . . . (1)

These conditions are obvious; the number of A's cannot be less
than zero, nor exceed the whole number of observations.

9. If two attributes be noted there are four ultimate frequencies
(AB), (Aβ), (αB), (αβ). The following conditions are given by
expanding each in terms of the frequencies of positive classes:

(a) (AB) ≥ 0                    or (AB) would be negative
(b) (AB) ≥ (A) + (B) - N        or (αβ) would be negative
(c) (AB) ≤ (A)                  or (Aβ) would be negative
(d) (AB) ≤ (B)                  or (αB) would be negative        . . . (2)

(a), (c), and (d) are obvious; (b) is perhaps a little less obvious,
and is occasionally forgotten. It is, however, of precisely the
same type as the other three. None of these conditions are
really of a new form, but may be derived at once from (1) (a) and
(1) (b) by specifying the universe as B or as β respectively. The
conditions (2) are therefore really covered by (1).

10. But a further point arises as regards such a system of
limits as is given by (2). The conditions (a) and (b) give lower or
minor limits to the value of (AB); (c) and (d) give upper or
major limits. If either major limit be less than either minor limit
the conditions are impossible, and it is necessary to see whether
(A) and (B) can take such values that this may be the case.
Expressing the condition that the major limits must be not less
than the minor, we have:

(A) ≥ 0        (B) ≥ 0
(A) ≤ N        (B) ≤ N

These are simply the conditions of the form (1). If, therefore,
(A) and (B) fulfil the conditions (1), the conditions (2) must be
possible. The conditions (1) and (2) therefore give all the
conditions of consistence for the case of two attributes, conditions
of an extremely simple and obvious kind.
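As a small illustration of conditions (1) and (2), the limits of (AB) may be computed directly. The sketch below is not from the original text: the function name is an assumption, and the second set of figures is invented purely for the example.

```python
def limits_of_AB(N, A, B):
    """Minor and major limits of (AB) implied by conditions (2)."""
    if not (0 <= A <= N and 0 <= B <= N):
        # Conditions (1) fail, so no value of (AB) can be consistent.
        raise ValueError("(A) and (B) must lie between 0 and N")
    minor = max(0, A + B - N)   # conditions (2)(a) and (2)(b)
    major = min(A, B)           # conditions (2)(c) and (2)(d)
    return minor, major


print(limits_of_AB(1000, 525, 312))   # (0, 312)
print(limits_of_AB(100, 60, 70))      # (30, 60)
```

In the second case condition (2)(b) is the operative one: with 60 A's and 70 B's in 100 observations, at least 30 must be both.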

11. Now consider the case of three attributes. There are
eight ultimate frequencies. Expanding the ultimate in terms of
the positive frequencies, and expressing the condition that each
expansion is not less than zero, we have the following, the
frequency named on the right being that which would otherwise be
negative:

(a) (ABC) ≥ 0                                                 (ABC)
(b) (ABC) ≥ (AB) + (AC) - (A)                                 (Aβγ)
(c) (ABC) ≥ (AB) + (BC) - (B)                                 (αBγ)
(d) (ABC) ≥ (AC) + (BC) - (C)                                 (αβC)
(e) (ABC) ≤ (AB)                                              (ABγ)
(f) (ABC) ≤ (AC)                                              (AβC)
(g) (ABC) ≤ (BC)                                              (αBC)
(h) (ABC) ≤ (AB) + (AC) + (BC) - (A) - (B) - (C) + N          (αβγ)        . . . (3)

These, again, are not conditions of a new form. We leave it
as an exercise for the student to show that they may be derived
from (1) (a) and (1) (b) by specifying the universe in turn as
BC, Bγ, βC, and βγ. The two conditions holding in four universes
give the eight inequalities above.
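The eight inequalities amount to four minor and four major limits for (ABC). A minimal Python sketch, with a function name assumed for the purpose, is given below; it is applied to the first- and second-order figures of § 5.

```python
def limits_of_ABC(N, A, B, C, AB, AC, BC):
    """Minor and major limits of (ABC) from the eight inequalities (3)."""
    minor = max(
        0,               # (a)
        AB + AC - A,     # (b)
        AB + BC - B,     # (c)
        AC + BC - C,     # (d)
    )
    major = min(
        AB,                               # (e)
        AC,                               # (f)
        BC,                               # (g)
        AB + AC + BC - A - B - C + N,     # (h)
    )
    return minor, major


minor, major = limits_of_ABC(1000, 525, 312, 470, 42, 147, 86)
print(minor, major)   # 0 -32
```

Here the major limit falls below the minor, so that no value whatever of (ABC) could be reconciled with the remaining figures.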

12. As in the last case, however, these conditions will be
impossible to fulfil if any one of the major limits (e)-(h) be less
than any one of the minor limits (a)-(d). The values on the right
must be such as to make no major limit less than a minor.

There are four major and four minor limits, or sixteen comparisons
in all to be made. But twelve of these, the student will
find, only lead back to conditions of the form (2) for (AB), (AC),
and (BC) respectively. The four comparisons of expansions due
to contrary frequencies ((a) and (h), (b) and (g), (c) and (f), (d)
and (e)) alone lead to new conditions, viz.:

(a)   (AB) + (AC) + (BC) ≥ (A) + (B) + (C) - N
(b)   (AB) + (AC) - (BC) ≤ (A)
(c)   (AB) - (AC) + (BC) ≤ (B)
(d) - (AB) + (AC) + (BC) ≤ (C)        . . . (4)


13. These are conditions of a wholly new type, not derivable
in any way from those given under (1) and (2). They are
conditions for the consistence of the second-order frequencies with
each other, whilst the inequalities of the form (2) are only conditions
for the consistence of the second-order frequencies with those of
lower orders. Given any two of the second-order frequencies, say
(AB) and (AC), the conditions (4) give limits for the third, viz.
(BC). They thus replace, for statistical purposes, the ordinary
rules of syllogistic inference. From data of the syllogistic form,
they would, of course, lead to the same conclusion, though in a
somewhat cumbrous fashion; one or two cases are suggested as
exercises for the student (Questions 6 and 7). The following
will serve as illustrations of the statistical uses of the
conditions:

Example i. Given that (A) = (B) = (C) = ½N and 80 per cent.
of the A's are B, 75 per cent. of A's are C, find the limits to the
percentage of B's that are C. The data are:

2(AB)/N = 0.8        2(AC)/N = 0.75

and the conditions give:

(a) 2(BC)/N ≥ 1 - 0.8 - 0.75
(b) 2(BC)/N ≥ 0.8 + 0.75 - 1
(c) 2(BC)/N ≤ 1 - 0.8 + 0.75
(d) 2(BC)/N ≤ 1 + 0.8 - 0.75

(a) gives a negative limit and (d) a limit greater than unity,
hence they may be disregarded. From (b) and (c) we have:

2(BC)/N ≥ 0.55        2(BC)/N ≤ 0.95

that is to say, not less than 55 per cent. nor more than 95 per
cent. of the B's can be C.

Example ii. If a report give the following frequencies as
actually observed, show that there must be a misprint or mistake
of some sort, and that possibly the misprint consists in the
dropping of a 1 before the 85 given as the frequency (BC).

N       1000
(A)      510        (AB)     189
(B)      490        (AC)     140
(C)      427        (BC)      85

From (4) (a) we have:

(BC) ≥ 510 + 490 + 427 - 1000 - 189 - 140
     ≥ 98

But 85 < 98, therefore it cannot be the correct value of (BC).
If we read 185 for 85 all the conditions are fulfilled.
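The limits used in Examples i and ii may be checked mechanically. The following Python sketch is illustrative only: the function name is an assumption, and the conditions of the form (2) for (BC) are included alongside conditions (4) so that the limits returned are the narrowest available.

```python
from fractions import Fraction


def limits_of_BC(N, A, B, C, AB, AC):
    """Minor and major limits of (BC), given the other positive frequencies."""
    minor = max(
        0,                           # (2)(a) applied to (BC)
        B + C - N,                   # (2)(b) applied to (BC)
        A + B + C - N - AB - AC,     # (4)(a)
        AB + AC - A,                 # (4)(b)
    )
    major = min(
        B, C,                        # (2)(c) and (2)(d) applied to (BC)
        B - AB + AC,                 # (4)(c)
        C + AB - AC,                 # (4)(d)
    )
    return minor, major


# Example ii: the reported 85 lies below the minor limit, 185 does not.
low, high = limits_of_BC(1000, 510, 490, 427, 189, 140)
print(low, high)            # 98 427
print(low <= 85 <= high)    # False
print(low <= 185 <= high)   # True

# Example i, taking N as unity and working in exact fractions.
half = Fraction(1, 2)
low_i, high_i = limits_of_BC(1, half, half, half,
                             Fraction(4, 5) * half, Fraction(3, 4) * half)
print(low_i / half, high_i / half)   # 11/20 19/20
```

The last line expresses the limits as proportions of the B's, i.e. 55 and 95 per cent.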



Example iii. In a certain set of 1000 observations (A) = 45,
(B) = 23, (C) = 14. Show that whatever the percentages of B's
that are A and of C's that are A, it cannot be inferred that any
B's are C.

The conditions (a) and (b) give the lower limit of (BC), which
is required. We find:

(a) (BC)/N ≥ 0.045 + 0.023 + 0.014 - 1 - (AB)/N - (AC)/N
(b) (BC)/N ≥ (AB)/N + (AC)/N - 0.045

The first limit is clearly negative. The second must also be
negative, since (AB)/N cannot exceed 0.023 nor (AC)/N 0.014.
Hence we cannot conclude that there is any limit to (BC) greater
than 0. This result is indeed immediately obvious when we
consider that, even if all the B's were A, and of the remaining
22 A's 14 were C's, there would still be 8 A's that were neither
B nor C.

14. The student should note the result of the last example, as it
illustrates the sort of result at which one may often arrive by
applying the conditions (4) to practical statistics. For given
values of N, (A), (B), (C), (AB), and (AC), it will often happen
that any value of (BC) not less than zero (or, more generally, not
less than either of the lower limits (2) (a) and (2) (b)) will satisfy
the conditions (4), and hence no true inference of a lower limit is
possible. The argument of the type “So many A's are B and
so many B's are C that we must expect some A's to be C” must
be used with caution.


REFERENCES.

(1) Morgan, A. de, Formal Logic, 1847 (chapter viii., “On the Numerically
    Definite Syllogism”).

(2) Boole, G., Laws of Thought, 1854 (chapter xix., “Of Statistical
    Conditions”).

    The above are the classical works with respect to the general theory
    of numerical consistence. The student will find both difficult to follow
    on account of their special notation, and, in the case of Boole's work,
    the special method employed.

(3) Yule, G. U., “On the Theory of Consistence of Logical Class-frequencies
    and its Geometrical Representation,” Phil. Trans., A, vol. cxcvii.
    (1901), p. 91. (Deals at length with the theory of consistence for
    any number of attributes, using the notation of the present chapters.)


EXERCISES. 


1. (For this and similar estimates cf. “Report by Miss Collet on the
Statistics of Employment of Women and Girls” [C.-7664] 1894.) If, in the
urban district of Bury, 817 per thousand of the women between 20 and 25
years of age were returned as “occupied” at the census of 1891, and 263 per
thousand as married or widowed, what is the lowest proportion per thousand
of the married or widowed that must have been occupied?

2. If, in a series of houses actually invaded by small-pox, 70 per cent. of the
inhabitants are attacked and 85 per cent. have been vaccinated, what is the
lowest percentage of the vaccinated that must have been attacked?

3. Given that 60 per cent. of the inmates of a workhouse are men, 60 per
cent. are “aged” (over 60), 80 per cent. non-able-bodied, 35 per cent. aged
men, 46 per cent. non-able-bodied men, and 42 per cent. non-able-bodied and
aged, find the greatest and least possible proportions of non-able-bodied aged
men.

4. (Material from ref. 6 of Chap. I.) The following are the proportions
per 10,000 of boys observed, with certain classes of defects amongst a number
of school-children. A = development defects, B = nerve signs, D = mental
dulness.

N   = 10,000        (D)  = 789
(A) =    877        (AB) = 338
(B) =  1,086        (BD) = 465

Show that some dull boys do not exhibit development defects, and state how
many at least do not do so.

5. The following are the corresponding figures for girls:

N   = 10,000        (D)  = 689
(A) =    682        (AB) = 248
(B) =    850        (BD) = 363

Show that some defectively developed girls are not dull, and state how many
at least must be so.

6. Take the syllogism “All A's are B, all B's are C, therefore all A's are
C,” express the premisses in terms of the notation of the preceding chapters,
and deduce the conclusion by the use of the general conditions of consistence.

7. Do the same for the syllogism “All A's are B, no B's are C, therefore
no A's are C.”

8. Given that (A) = (B) = (C) = ½N, and that (AB)/N = (AC)/N = p, find
what must be the greatest or least values of p in order that we may infer
that (BC)/N exceeds any given value, say q.
9. Show that if

(A)/N = x        (B)/N = 2x        (C)/N = 3x

and

(AB)/N = (AC)/N = (BC)/N = y,

the value of neither x nor y can exceed ¼.



CHAPTER III.

ASSOCIATION.

1-4. The criterion of independence. 5-10. The conception of association and
testing for the same by the comparison of percentages. 11-12.
Numerical equality of the differences between the four second-order
frequencies and their independence values. 13. Coefficients of
association. 14. Necessity for an investigation into the causation of an
attribute A being extended to include non-A's.

1. If there is no sort of relationship, of any kind, between two
attributes A and B, we expect to find the same proportion of A's
amongst the B's as amongst the non-B's. We may anticipate,
for instance, the same proportion of abnormally wet seasons in
leap years as in ordinary years, the same proportion of male to
total births when the moon is waxing as when it is waning, the
same proportion of heads whether a coin be tossed with the right
hand or the left.

Two such unrelated attributes may be termed independent, and
we have accordingly as the criterion of independence for A and B:

(AB)/(B) = (Aβ)/(β)        . . . (1)

If this relation hold good, the corresponding relations

(αB)/(B) = (αβ)/(β)

(AB)/(A) = (αB)/(α)

(Aβ)/(A) = (αβ)/(α)

must also hold. For it follows at once from (1) that

[(B) - (AB)]/(B) = [(β) - (Aβ)]/(β),

that is

(αB)/(B) = (αβ)/(β),

and the other two identities may be similarly deduced.

The student may find it easier to grasp the nature of the
relations stated if the frequencies are supposed grouped into a table
with two rows and two columns, thus:

                    Attribute
Attribute        B          β          Total
   A            (AB)       (Aβ)        (A)
   α            (αB)       (αβ)        (α)
 Total          (B)        (β)          N

Equation (1) states a certain equality for the columns; if this
holds good, the corresponding equation

(AB)/(A) = (αB)/(α)

must hold for the rows, and so on.

2. The criterion may, however, be put into a somewhat
different and theoretically more convenient form. The equation
(1) expresses (AB) in terms of (B), (β), and a second-order
frequency (Aβ); eliminating this second-order frequency we have:

(AB)/(B) = [(AB) + (Aβ)]/[(B) + (β)] = (A)/N,

i.e. in words, “the proportion of A's amongst the B's is the same
as in the universe at large.” The student should learn to
recognise this equation at sight in any of the forms:

(AB)/(B) = (A)/N                  (a)
(AB)/(A) = (B)/N                  (b)
(AB) = (A)(B)/N                   (c)
(AB)/N = (A)/N × (B)/N            (d)        . . . (2)

The equation (d) gives the important fundamental rule: If the
attributes A and B are independent, the proportion of AB's in the
universe is equal to the proportion of A's multiplied by the
proportion of B's.

The advantage of the forms (2) over the form (1) is that they
give expressions for the second-order frequency in terms of the
frequencies of the first order and the whole number of
observations alone; the form (1) does not.

Example i. If there are 144 A's and 384 B's in 1024 observations,
how many AB's will there be, A and B being independent?

144 × 384 / 1024 = 54

There will therefore be 54 AB's.

Example ii. If the A's are 60 per cent., the B's 35 per cent., of
the whole number of observations, what must be the percentage
of AB's in order that we may conclude that A and B are
independent?

60 × 35 / 100 = 21

and therefore there must be 21 per cent. (more or less closely, cf.
§§ 7, 8 below) of AB's in the universe to justify the conclusion
that A and B are independent.

3. It follows from § 1 that if the relation (2) holds for any one
of the four second-order frequencies, e.g. (AB), similar relations
must hold for the remaining three. Thus we have directly
from (1):

(Aβ)/(β) = [(AB) + (Aβ)]/[(B) + (β)] = (A)/N,

giving

(Aβ) = (A)(β)/N,

and so on. This is seen at once to be true on consideration
of the fourfold table on p. 26. For if (AB) takes the value
(A)(B)/N, (Aβ) must take the value (A)(β)/N to keep the total
of the row equal to (A), and so on for the other rows and columns.
The fourfold table in the case of independence must in fact have
the form:

                      Attribute
Attribute        B               β               Total
   A          (A)(B)/N        (A)(β)/N           (A)
   α          (α)(B)/N        (α)(β)/N           (α)
 Total          (B)             (β)               N

Example iii. In Example i. above, what would be the number
of αβ's, A and B being independent?

(α) = 1024 - 144 = 880
(β) = 1024 - 384 = 640

(αβ) = 880 × 640 / 1024 = 550
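The whole fourfold table for the case of independence may be filled in from N, (A), and (B) alone. The sketch below is an illustration and not part of the original text; the function name and the ASCII keys (a and b for α and β) are assumptions, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def independence_table(N, A, B):
    """Independence values of the four second-order frequencies."""
    alpha, beta = N - A, N - B
    return {
        "AB": Fraction(A * B, N),
        "Ab": Fraction(A * beta, N),
        "aB": Fraction(alpha * B, N),
        "ab": Fraction(alpha * beta, N),
    }


# Figures of Examples i and iii.
table = independence_table(1024, 144, 384)
print({name: int(value) for name, value in table.items()})
# {'AB': 54, 'Ab': 90, 'aB': 330, 'ab': 550}
```

The four values add up to 1024, and the first and last agree with the results of Examples i and iii.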


4. Finally, the criterion of independence may be expressed in
yet a third form, viz. in terms of the second-order frequencies
alone. If A and B are independent, it follows at once from the
preceding section that

(AB)(αβ) = (A)(B)(α)(β)/N².

And evidently (αB)(Aβ) is equal to the same fraction.

Therefore:

(AB)(αβ) = (αB)(Aβ)              (a)
(AB)/(αB) = (Aβ)/(αβ)            (b)
(AB)/(Aβ) = (αB)/(αβ)            (c)        . . . (3)

The equation (b) may be read “The ratio of A's to α's amongst
the B's is equal to the ratio of A's to α's amongst the β's,” and
(c) similarly.

This form of criterion is a convenient one if all the four second-
order frequencies are given, enabling one to recognise almost at a
glance whether or not the two attributes are independent.

Example iv. If the second-order frequencies have the following
values, are A and B independent or not?

(AB) = 110    (αB) = 90    (Aβ) = 290    (αβ) = 510.

Clearly (AB)(αβ) > (αB)(Aβ), since 110 × 510 = 56,100 while
90 × 290 = 26,100, so A and B are not independent.

5. Suppose now that A and B are not independent, but related
in some way or other, however complicated.

Then if

(AB) > (A)(B)/N,

A and B are said to be positively associated, or sometimes simply
associated. If, on the other hand,

(AB) < (A)(B)/N,

A and B are said to be negatively associated or, more briefly,
disassociated.

The student should notice that these words are not used
exactly in their ordinary senses, but in a technical sense. When
A and B are said to be associated, it is not meant merely that
some A's are B's, but that the number of A's which are B's exceeds
the number to be expected if A and B are independent. Similarly,
when A and B are said to be negatively associated or disassociated,
it is not meant that no A's are B's, but that the number of A's
which are B's falls short of the number to be expected if A and B
are independent. “Association” cannot be inferred from the mere
fact that some A's are B's, however great that proportion; this
principle is fundamental, and should be always borne in mind.

6. The greatest possible value of (AB) for given values of
N, (A), and (B) is either (A) or (B) (whichever is the less). When
(AB) attains either of these values, A and B may be said to be
completely or perfectly associated. The lowest possible value of
(AB), on the other hand, is either zero or (A) + (B) - N
(whichever is the greater). When (AB) falls to either of these values,
A and B may be said to be completely disassociated. Complete
association is generally understood to correspond to one or other
of the cases, “All A's are B” or “All B's are A,” or it may be
more narrowly defined as corresponding only to the case when
both these statements are true. Complete disassociation may
be similarly taken as corresponding to one or other of the cases,
“No A's are B,” or “no α's are β,” or more narrowly to the
case when both these statements are true. The greater the
divergence of (AB) from the value (A)(B)/N towards the
limiting value in either direction, the greater, we may say, is the
intensity of association or of disassociation, so that we may speak
of attributes being more or less, highly or slightly associated. This
conception of degrees of association, degrees which may in fact be
measured by certain formulae (cf. § 13), is important.

7. When the association is very slight, i.e. where (AB) only
differs from (A)(B)/N by a few units or by a small proportion, it
may be that such association is not really significant of any
definite relationship. To give an illustration, suppose that a coin
is tossed a number of times, and the tosses noted in pairs; then
100 pairs may give such results as the following (taken from an
actual record):

First toss heads and second heads        26
First toss heads and second tails        18
First toss tails and second heads        27
First toss tails and second tails        29

If we use A to denote “heads” in the first toss, B “heads” in
the second, we have from the above (A) = 44, (B) = 53. Hence
(A)(B)/N = 44 × 53 / 100 = 23.32, while actually (AB) is 26. Hence
there is a positive association, in the given record, between
the result of the first throw and the result of the second. But it
is fairly certain, from the nature of the case, that such association
cannot indicate any real connection between the results of the
two throws; it must therefore be due merely to such a complex
system of causes, impossible to analyse, as leads, for example, to
differences between small samples drawn from the same material.
The conclusion is confirmed by the fact that, of a number of such
records, some give a positive association (like the above), but
others a negative association.
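The comparison made in this section, of the actual (AB) with its value in the case of independence, can be sketched as follows. The function name and return layout are assumptions made for illustration; the figures are those of the record of coin-tossing above. The code reports only the sign of the divergence, and says nothing as to whether it is more than a fluctuation of sampling.

```python
from fractions import Fraction


def compare_with_independence(AB, Ab, aB, ab):
    """Return (A), (B), the independence value of (AB), and the sign
    of the association shown by the four second-order frequencies."""
    N = AB + Ab + aB + ab
    A = AB + Ab
    B = AB + aB
    expected = Fraction(A * B, N)
    if AB > expected:
        verdict = "positive"
    elif AB < expected:
        verdict = "negative"
    else:
        verdict = "independent"
    return A, B, expected, verdict


A, B, expected, verdict = compare_with_independence(26, 18, 27, 29)
print(A, B, float(expected), verdict)   # 44 53 23.32 positive
```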

8. An event due, like the above occurrence of positive
association, to an extremely complex system of causes of the general
nature of which we are aware, but of the detailed operation of
which we are ignorant, is sometimes said to be due to chance, or
better to the chances or fluctuations of sampling.

A little consideration will suggest that such associations due to
the fluctuations of sampling must be met with in all classes of
statistics. To quote, for instance, from § 1, the two illustrations
there given of independent attributes, we know that in any
actual record we would not be likely to find exactly the same
proportion of abnormally wet seasons in leap years as in ordinary
years, nor exactly the same proportion of male births when the
moon is waxing as when it is waning. But so long as the
divergence from independence is not well marked we must regard such
attributes as practically independent, or dependence as at least
unproved.

The discussion of the question, how great the divergence must
be before we can consider it as “well marked,” must be postponed
to the chapters dealing with the theory of sampling. At present
the attention of the student can only be directed to the existence
of the difficulty, and to the serious risk of interpreting a “chance
association” as physically significant.

9. The definition of § 5 suggests that we are to test the
existence or the intensity of association between two attributes
by a comparison of the actual value of (AB) with its independence-
value (as it may be termed) (A)(B)/N. The procedure is from the
theoretical standpoint perhaps the most natural, but it is more
usual, and is simplest and best in practice, to compare proportions,
e.g. the proportion of A's amongst the B's with the proportion
amongst the β's. Such proportions are usually expressed in the
form of percentages or proportions per thousand.

It will be evident from §§ 1 and 2 that a large number of such
comparisons are available for the purpose, and the question arises,
therefore, which is the best comparison to adopt.

10. Two principles should decide this point: (1) of any two
comparisons, that is the better which brings out the more clearly
the degree of association; (2) of any two comparisons, that is the
better which illustrates the more important aspect of the problem
under discussion.

The first condition at once suggests that comparisons of the
form

(AB)/(B)  with  (Aβ)/(β)        . . . (a)

are better than comparisons of the form

(AB)/(B)  with  (A)/N        . . . (b)

For it is evident that if most of the objects or individuals in the
universe are B's, i.e. if (B)/N approaches unity, (AB)/(B) will
necessarily approach (A)/N even though the difference between
(AB)/(B) and (Aβ)/(β) is considerable. The second form of
comparison may therefore be misleading.

Setting aside, then, comparisons of the general form (b), the
question remains whether to apply the comparison of the form (a)
to the rows or the columns of the table, if the data are tabulated
as on p. 26. This question must be decided with reference to the
second principle, i.e. with regard to the more important aspect of
the problem under discussion, the exact question to be answered,
or the hypothesis to be tested, as illustrated by the examples
below. Where no definite question has to be answered or
hypothesis tested both pairs of proportions may be tabulated,
as in Example vi.

Example v. Association between inoculation against cholera
and exemption from attack. (Data from Greenwood and Yule,
Table III., ref. 6.)

                    Not attacked.    Attacked.    Total.
Inoculated               276              3         279
Not inoculated           473             66         539
Total                    749             69         818

Here the important question is, How far does inoculation
protect from attack? The most natural comparison is therefore:

Percentage of inoculated who were not attacked          98.9
Percentage of not inoculated who were not attacked      87.8

or we might tabulate the complementary proportions:

Percentage of inoculated who were attacked               1.1
Percentage of not inoculated who were attacked          12.2

Either comparison brings out simply and clearly the fact that
inoculation and exemption from attack are positively associated
(inoculation and attack negatively associated).

We are making above a comparison by rows in the notation of
the table on p. 26, comparing (AB)/(A) with (αB)/(α), or (Aβ)/(A)
with (αβ)/(α). A comparison by columns, e.g. (AB)/(B) with
(Aβ)/(β), would serve equally to indicate whether there was any
appreciable association, but would not answer directly the
particular question we have in mind:

Percentage of not-attacked who were inoculated          36.8
Percentage of attacked who were inoculated               4.3
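The two ways of taking percentages in this example, by rows and by columns, may be set side by side in a short Python sketch. The dictionary layout and the helper name are assumptions made for illustration; the counts are those of the table of Example v.

```python
def percent(part, whole):
    """Percentage of `whole` formed by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


table = {
    "inoculated":     {"not_attacked": 276, "attacked": 3},
    "not_inoculated": {"not_attacked": 473, "attacked": 66},
}

# Comparison by rows: how far does inoculation protect from attack?
by_rows = {
    group: percent(counts["not_attacked"],
                   counts["not_attacked"] + counts["attacked"])
    for group, counts in table.items()
}

# Comparison by columns: what share of each outcome class was inoculated?
by_columns = {
    outcome: percent(table["inoculated"][outcome],
                     table["inoculated"][outcome]
                     + table["not_inoculated"][outcome])
    for outcome in ("not_attacked", "attacked")
}

print(by_rows)     # {'inoculated': 98.9, 'not_inoculated': 87.8}
print(by_columns)  # {'not_attacked': 36.8, 'attacked': 4.3}
```

Both comparisons show the association, but only the first answers the question actually asked.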

Example vi. Deaf-mutism and Imbecility. (Material from
Census of 1901. Summary Tables. [Cd. 1523.])

Total population of England and Wales          32,528,000
Number of the imbecile (or feeble-minded)          48,882
Number of deaf-mutes                               15,246
Number of imbecile deaf-mutes                         451

Required, to find whether deaf-mutism is associated with
imbecility.

We may denote the number of the imbecile by (A), of deaf-
mutes by (B). A comparison of (AB)/(B) with (A)/N or of
(AB)/(A) with (B)/N may very well be used in this case, seeing
that (A)/N and (B)/N are both small. The question whether to
give the preference to the first or the second comparison depends
on the nature of the investigation we wish to make. If it is
desired to exhibit the conditions among deaf-mutes the first may
be used:

Proportion of imbeciles among deaf-mutes = (AB)/(B)           29.6 per thousand.
Proportion of imbeciles in the whole population = (A)/N        1.5 per thousand.

If, on the other hand, it is desired to exhibit the conditions
amongst the imbecile, the second will be preferable:

Proportion of deaf-mutes amongst the imbecile = (AB)/(A)       9.2 per thousand.
Proportion of deaf-mutes in the whole population = (B)/N       0.5 per thousand.

Either comparison exhibits very clearly the high degree of
association between the attributes. It may be pointed out, however,
that census data as to such infirmities are very untrustworthy.
Example vii. Eye-colour of father and son (material due
to Sir Francis Galton, as given by Professor Karl Pearson, Phil.
Trans., A, vol. cxcv. (1900), p. 138; the classes 1, 2, and 3 of the
memoir treated as light).

Fathers with light eyes and sons with light eyes (AB)                 471
Fathers with light eyes and sons with not light eyes (Aβ)             151
Fathers with not light eyes and sons with light eyes (αB)             148
Fathers with not light eyes and sons with not light eyes (αβ)         230

Required to find whether the colour of the son's eyes is
associated with that of the father's. In cases of this kind the
father is reckoned once for each son; e.g. a family in which the
father was light-eyed, two sons light-eyed and one not, would be
reckoned as giving two to the class AB and one to the class Aβ.

The best comparison here is:

Percentage of light-eyed amongst the sons of light-eyed fathers           76 per cent.
Percentage of light-eyed amongst the sons of not-light-eyed fathers       39 per cent.

But the following is equally valid:

Percentage of light-eyed amongst the fathers of light-eyed sons           76 per cent.
Percentage of light-eyed amongst the fathers of not-light-eyed sons       40 per cent.

The reason why the former comparison is preferable is, that we
usually wish to estimate the character of offspring from that of
the parents, and define heredity in terms of the resemblance of
offspring to parents. We do not, as a rule, want to make use of
the power of estimating the character of parents from that of their
offspring, nor do we define heredity in terms of the resemblance
of parents to offspring. Both modes of statement, however,
indicate equally clearly the tendency to resemblance between
father and son.

Example viii. Association between inoculation against cholera
and exemption from attack, five separate epidemics (cf. Example
v.; data from Tables IX., X., XXVIII., XXIX., XXXI. of
reference 6).

                    Not attacked.    Attacked.     Total.
Inoculated               192              4           196
Not inoculated           113             34           147
Total                    305             38           343

                    Not attacked.    Attacked.     Total.
Inoculated             5,751             27         5,778
Not inoculated         6,351            198         6,549
Total                 12,102            225        12,327

                    Not attacked.    Attacked.     Total.
Inoculated             4,087              5         4,092
Not inoculated       113,856          1,144       115,000
Total                117,943          1,149       119,092

                    Not attacked.    Attacked.     Total.
Inoculated             8,332              8         8,340
Not inoculated        84,444            556        85,000
Total                 92,776            564        93,340

                    Not attacked.    Attacked.     Total.
Inoculated             4,870              5         4,875
Not inoculated       153,096            904       154,000
Total                157,966            909       158,875

With the table of Example v. the above give data for six
separate epidemics, in all of which the same method of inoculation
appears to have been used; the data refer to natives only,
and the numbers of observations are sufficiently large to reduce
“fluctuations of sampling” within reasonably narrow limits.
The proportions not attacked are as follows:

            Proportion not attacked.
        Not inoculated.    Inoculated.    Difference.
1           0.8776           0.9892         0.1116
2           0.7687           0.9796         0.2109
3           0.9698           0.9953         0.0255
4           0.9901           0.9988         0.0087
5           0.9935           0.9990         0.0055
6           0.9941           0.9990         0.0049

In each case inoculation and exemption from attack are positively
associated, but it will be seen that the several proportions, and
the differences between them, vary considerably. Evidently in
a very mild epidemic this difference can only be small, and the
question arises how far the data for the separate epidemics can
be said to be consistent in their indication of the “efficiency”
of the inoculation. This is not a simple question to answer:
the more advanced student is referred to the discussion in the
original.
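The table of proportions can be recomputed from the six fourfold tables. The sketch below is illustrative only; the names are assumptions, and the counts of the attacked amongst the not inoculated are taken so that each row agrees with its printed total (6,549, 115,000, 85,000, and 154,000).

```python
# Each entry: (inoculated: not attacked, attacked),
#             (not inoculated: not attacked, attacked).
epidemics = [
    ((276, 3), (473, 66)),          # Example v
    ((192, 4), (113, 34)),
    ((5751, 27), (6351, 198)),
    ((4087, 5), (113856, 1144)),
    ((8332, 8), (84444, 556)),
    ((4870, 5), (153096, 904)),
]


def proportion_not_attacked(not_attacked, attacked):
    """Proportion not attacked, to four decimal places."""
    return round(not_attacked / (not_attacked + attacked), 4)


rows = []
for inoculated, not_inoculated in epidemics:
    p_not = proportion_not_attacked(*not_inoculated)
    p_inoc = proportion_not_attacked(*inoculated)
    rows.append((p_not, p_inoc, round(p_inoc - p_not, 4)))

for number, row in enumerate(rows, start=1):
    print(number, *row)
```

The difference is positive in every epidemic, but ranges from about 0.21 down to about 0.005, which is the variation remarked upon in the text.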

11. The values that the four second-order frequencies take in
the case of independence, viz.

(A)(B)/N,    (α)(B)/N,    (A)(β)/N,    (α)(β)/N,

are of such great theoretical importance, and of so much use
as reference values for comparing with the actual values of
the frequencies (AB), (αB), (Aβ), and (αβ), that it is often
desirable to employ single symbols to denote them. We shall use
the symbols:

(AB)₀ = (A)(B)/N        (αB)₀ = (α)(B)/N

(Aβ)₀ = (A)(β)/N        (αβ)₀ = (α)(β)/N

If δ denote the excess of (AB) over (AB)₀, then, in order to keep
the totals of rows and columns constant, the general table
(cf. the table for the case of independence on p. 27) must
be of the form:

                      Attribute
Attribute        B               β               Total
   A          (AB)₀ + δ       (Aβ)₀ - δ          (A)
   α          (αB)₀ - δ       (αβ)₀ + δ          (α)
 Total          (B)             (β)               N

Therefore, quite generally we have:

(AB) - (AB)₀ = (αβ) - (αβ)₀ = (Aβ)₀ - (Aβ) = (αB)₀ - (αB).

12 The value of this common difFeience 8 may be expressed 
in a form that is useful to note. We have by definition — 

Z^{AB)-{AB),^{AB)-i£ip. 

Bring the terms on the right to a common denominator, and 
expiess all the frequencies of the numerator in terms of those of 
the second order ; then we have — 

“ . 

That is to say, the common difForence is equal to 1 /Wth of the 
difference of the cross products” {AB){a/3) and {aB)(A/3). 

It is evident that the difterence of the cross-pioducts may be 
very large if W be large, although 8 is really very small In 
using the difference of the cross-products to tost mentally the 
sign of the association in a case where all the four second order 
frequencies are given, this should be remembered : the difference 
should he compared with W, or it will be liable to suggest a higher 
degree of association than actually exists 
Example ix. The following data were observed for hybrids of
Datura (W. Bateson and Miss Saunders, Report to the Evolution
Committee of the Royal Society, 1902):

    Flowers violet, fruits prickly (AB) . . . 47
    Flowers violet, fruits smooth  (Aβ) . . . 12
    Flowers white,  fruits prickly (αB) . . . 21
    Flowers white,  fruits smooth  (αβ) . . .  3

Investigate the association between colour of flower and
character of fruit.

Since 3 × 47 = 141, 12 × 21 = 252, i.e. (AB)(αβ) < (αB)(Aβ),
there is clearly a negative association; 252 − 141 = 111, and at
first sight this considerable difference is apt to suggest a
considerable association. But N = 83, and numerically
δ = 111/83 = 1.3 only, so that in point of
fact the association is small, so small that no stress can be laid
on it as indicating anything but a fluctuation of sampling.
Working out the percentages we have:

    Percentage of violet-flowered plants with prickly fruits . . 80 per cent.
    Percentage of white-flowered plants with prickly fruits  . . 88 per cent.
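
The arithmetic of Example ix may be checked with a short sketch in Python. This is an illustration only and not part of the original text; the function name `fourfold_summary` and the variable names are assumptions introduced here. It computes δ directly from its definition, (AB) − (A)(B)/N, and the difference of the cross products, so that the relation of § 12 between the two can be confirmed for these data.

```python
from fractions import Fraction

def fourfold_summary(ab, a_beta, alpha_b, alpha_beta):
    """Return N, delta and the cross-product difference for a fourfold table.

    Arguments are the four second-order frequencies (AB), (A beta),
    (alpha B), (alpha beta). Names are illustrative assumptions.
    """
    n = ab + a_beta + alpha_b + alpha_beta
    a_total = ab + a_beta                      # (A)
    b_total = ab + alpha_b                     # (B)
    expected_ab = Fraction(a_total * b_total, n)   # (AB)_0 = (A)(B)/N
    delta = ab - expected_ab                   # delta by definition
    cross_diff = ab * alpha_beta - alpha_b * a_beta
    return n, delta, cross_diff

# Datura data of Example ix
n, delta, cross_diff = fourfold_summary(47, 12, 21, 3)

# Percentages with prickly fruits among violet- and white-flowered plants
prickly_among_violet = Fraction(47, 47 + 12)
prickly_among_white = Fraction(21, 21 + 3)

print(n, cross_diff, delta)                     # delta is exactly cross_diff / N
print(round(100 * float(prickly_among_violet), 1))
print(round(100 * float(prickly_among_white), 1))
```

Exact fractions are used so that the identity δ = (cross-product difference)/N can be tested without rounding error; the sign of δ (negative here) is retained, whereas the text quotes its magnitude.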

13. While the methods used in the preceding pages suffice for
nearly all practical purposes, it may be convenient to measure
the intensities of association in different cases by means of some
formula or "coefficient," so devised as to be zero when the
attributes are independent, +1 when they are completely associated,
and −1 when they are completely disassociated, in the sense of
§ 6. If we use the term "complete association" in the wider
sense there defined, we have, grouping the frequencies in fourfold
tables, the three cases of complete association:

          (1)                     (2)                     (3)

    (AB)   0    (A)         (AB)  (Aβ)  (A)         (AB)   0    (A)
    (αB)  (αβ)  (α)          0    (αβ)  (α)          0    (αβ)  (α)
    (B)   (β)    N          (B)   (β)    N          (B)   (β)    N

In the first case all A's are B, and so (Aβ) = 0; in the second
all B's are A, and so (αB) = 0; and in the third case we have
(Aβ) = 0 and (αB) = 0, so that all A's are B and also all B's are A.
The three corresponding cases of complete disassociation are:

          (4)                     (5)                     (6)

     0    (Aβ)  (A)         (AB)  (Aβ)  (A)          0    (Aβ)  (A)
    (αB)  (αβ)  (α)         (αB)   0    (α)         (αB)   0    (α)
    (B)   (β)    N          (B)   (β)    N          (B)   (β)    N

It is required to devise some formula which shall give the value
+1 in the first three cases, −1 in the second three, and shall
also be zero where the attributes are independent. Many such
formulae may be devised, but perhaps the simplest possible (though
not necessarily the most advantageous) is the expression

\[ Q = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)} = \frac{N\delta}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}, \]

where δ is the symbol used in the two last sections for the
difference (AB) − (AB)₀. It is evident that Q is zero when the
attributes are independent, for then δ is zero; it takes the value +1
when there is complete association, for then the second term in
both numerator and denominator of the first form of the expression
is zero; similarly it is −1 where there is complete disassociation,
for then the first term in both numerator and denominator is
zero. Q may accordingly be termed a coefficient of association.
As illustrations of the values it will take in certain cases, the
association between deaf-mutism and imbecility, on the basis of the
English census figures (Example vi.), is +0.91; between light eye
colour in father and in son (Example vii.) +0.66; between colour of
flower and prickliness of fruit in Datura (Example ix.) −0.28, an
association which, however, as already stated, is probably of no
practical significance and due to mere fluctuations of sampling.

The student should note that the value of Q for a given table
is unaltered by multiplying either a row or a column by any
arbitrary number, i.e. the value is independent of the relative
proportions of A's and α's included in the table. This property
is of importance, and renders such a measure of association
specially adapted to cases (e.g. experiments) in which the
proportions are arbitrary. A form possessing the same property but
certain marked advantages over Q is suggested in ref. (3).
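
The properties claimed for Q may be illustrated by a minimal sketch in Python, offered as an aid to the reader and not as part of the original text. The function name `yule_q` and the small tables used for the limiting cases are assumptions chosen for illustration; only the Datura frequencies come from Example ix.

```python
from fractions import Fraction

def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Coefficient of association Q of section 13 for a fourfold table."""
    num = ab * alpha_beta - a_beta * alpha_b
    den = ab * alpha_beta + a_beta * alpha_b
    return Fraction(num, den)

# Datura data of Example ix: Q = -111/393, about -0.28
q_datura = yule_q(47, 12, 21, 3)

# Multiplying a row (here the A row by 10) or a column (the B column by 7)
# leaves Q unaltered.
q_row_scaled = yule_q(47 * 10, 12 * 10, 21, 3)
q_col_scaled = yule_q(47 * 7, 12, 21 * 7, 3)

# Hypothetical tables for the limiting cases:
q_complete = yule_q(40, 0, 15, 45)   # (A beta) = 0, all A's are B
q_disassoc = yule_q(0, 30, 20, 50)   # (AB) = 0
q_indep = yule_q(20, 30, 40, 60)     # 20 * 60 = 30 * 40, independence

print(round(float(q_datura), 2), q_complete, q_disassoc, q_indep)
```

The invariance under scaling follows because the multiplier appears in both cross products, and so in every term of numerator and denominator alike.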








The coefficient is only mentioned here to direct the attention
of the student to the possibility of forming such a measure of
association, a measure which serves a similar purpose in the case
of attributes to that served by certain other coefficients in the
cases of manifold classification (cf. Chap. V.) and of variables
(cf. Chap. IX., and the references to Chaps. X. and XVI.). For
further illustrations of the use of this coefficient the reader is
referred to the reference (1) at the end of this chapter; for the
modified form of the coefficient, possessing the same properties
but certain advantages, to ref. (3); and for a mode of deducing
another coefficient, based on theorems in the theory of variables,
which has come into more general use, though in the opinion of
the present writer its use is of doubtful advantage, to ref. (4).
Reference should also be made to the coefficient described in § 10
of Chap. XI. The question of the best coefficient to use as a
measure of association is still the subject of controversy: for a
discussion the student is referred to refs. (3), (5), and (6).

14. In concluding this chapter, it may be well to repeat, for the
sake of emphasis, that (cf. § 5) the mere fact of 80, 90, or 99 per
cent. of A's being B implies nothing as to the association of A
with B; in the absence of information, we can but assume that
80, 90, or 99 per cent. of α's may also be B. In order to apply
the criterion of independence for two attributes A and B, it is
necessary to have information concerning α's and β's as well as
A's and B's, or concerning a universe that includes both α's and
A's, β's and B's. Hence an investigation as to the causal
relations of an attribute A must not be confined to A's, but must
be extended to α's (unless, of course, the necessary information
as to α's is already obtainable): no comparison is otherwise
possible. It would be no use to obtain with great pains the
result (cf. Example vi.) that 29.6 per thousand of deaf-mutes
were imbecile unless we knew that the proportion of imbeciles
in the whole population was only 1.5 per thousand; nor would
it contribute anything to our knowledge of the heredity of
deaf-mutism to find out the proportion of deaf-mutes amongst the
offspring of deaf-mutes unless the proportions amongst the
offspring of normal individuals were also investigated or known.

REFERENCES.

(1) Yule, G. U., "On the Association of Attributes in Statistics," Phil.
    Trans. Roy. Soc., Series A, vol. cxciv., 1900, p. 257. (Deals fully
    with the theory of association: the association coefficient of § 13
    suggested.)

(2) Yule, G. U., "Notes on the Theory of Association of Attributes in
    Statistics," Biometrika, vol. ii., 1903, p. 121. (Contains an abstract
    of the principal portions of (1) and other matter.)

(3) Yule, G. U., "On the Methods of Measuring the Association between Two
    Attributes," Jour. Roy. Stat. Soc., vol. lxxv., 1912, pp. 579-642. (A
    critical survey of the various coefficients that have been suggested for
    measuring association and their properties: a modified form of the
    coefficient of § 13 given which possesses marked advantages.)

(4) Pearson, Karl, "On the Correlation of Characters not Quantitatively
    Measurable," Phil. Trans. Roy. Soc., Series A, vol. cxcv., 1900, p. 1.
    (Deals with the problem of measurement of intensity of association
    from the standpoint of the theory of variables, giving a method which
    has since been largely used: only the advanced student will be able to
    follow the work. For a criticism see ref. 3.)

(5) Pearson, Karl, and David Heron, "On Theories of Association,"
    Biometrika, vol. ix., 1913, pp. 169-332. (A reply to criticisms in ref. 3.)

(6) Greenwood, M., and G. U. Yule, "The Statistics of Anti-typhoid and
    Anti-cholera Inoculations, and the Interpretation of such Statistics in
    general," Proc. Roy. Soc. of Medicine, vol. viii., 1916, p. 113. (Cited
    for the discussion of association coefficients in § 4, and the conclusion
    that none of these coefficients are of much value for comparative
    purposes in interpreting statistics of the type considered.)

(7) Lipps, G. F., "Die Bestimmung der Abhängigkeit zwischen den Merkmalen
    eines Gegenstandes," Berichte d. math.-phys. Klasse d. kgl. sächsischen
    Gesellschaft d. Wissenschaften, Leipzig, Feb. 1905. (Deals with the
    general theory of the dependence between two characters, however
    classified; the coefficient of association of § 13 is again suggested
    independently.)

EXERCISES.

1. At the census of England and Wales in 1901 there were (to the nearest
1000) 15,729,000 males and 16,799,000 females; 3497 males were returned
as deaf-mutes from childhood, and 3072 females.

State proportions exhibiting the association between deaf-mutism from
childhood and sex. How many of each sex for the same total number would
have been deaf-mutes if there had been no association?

2. Show, as briefly as possible, whether A and B are independent,
positively associated, or negatively associated in each of the following cases:


    (a)   N    = 6000     (A)  = 2350     (B)  = 3100     (AB) = 1600
    (b)   (A)  =  490     (AB) =  294     (α)  =  670     (αB) =  380
    (c)   (AB) =  266     (αB) =  768     (Aβ) =   48     (αβ) =  144


3. (Figures derived from Darwin's Cross- and Self-fertilisation of Plants,
cf. ref. 1, p. 294.) The table below gives the numbers of plants of certain
species that were above or below the average height, stating separately those
that were derived from cross-fertilised and from self-fertilised parentage.
Investigate the association between height and cross-fertilisation of parentage,
and draw attention to any special points you notice.


Species 

Parentage Cross fer- 
tilised Height— 

Parentage Self fer- 
tilised. Height- 

Above 

Average 

Below 

Average. 

Above 

Average. 

Below 

Average 

Ipomaja purpurea . 

68 

10 

18 

66 

Petunia violacea .... 



IS 

64 

Reseda lutea 



11 

21 

Reseda odorata 



26 

80 

Lobelia fulgens 


■H 

12 

22 







4. (Figures from same source as Example vii., p. 33, but material differently
grouped; classes 7 and 8 of the memoir treated as "dark.") Investigate the
association between darkness of eye colour in father and son from the following
data:

    Fathers with dark eyes and sons with dark eyes (AB) . . . . . . .   60
    Fathers with dark eyes and sons with not-dark eyes (Aβ) . . . . .   79
    Fathers with not-dark eyes and sons with dark eyes (αB) . . . . .   89
    Fathers with not-dark eyes and sons with not-dark eyes (αβ) . . .  782

Also tabulate for comparison the frequencies that would have been observed
had there been no heredity, i.e. the values of (AB)₀, (Aβ)₀, etc. (§ 11).

5. (Figures from same source as above.) Investigate the association between
eye colour of husband and eye colour of wife ("assortative mating") from
the data given below:

    Husbands with light eyes and wives with light eyes (AB) . . . . . . .  309
    Husbands with light eyes and wives with not-light eyes (Aβ) . . . . .  214
    Husbands with not-light eyes and wives with light eyes (αB) . . . . .  132
    Husbands with not-light eyes and wives with not-light eyes (αβ) . . .  119

Also tabulate for comparison the frequencies that would have been observed
had there been strict independence between eye colour of husband and eye
colour of wife, i.e. the values of (AB)₀, etc., as in question 4.

6. (Figures from the Census of England and Wales, 1891, vol. iii.: the data
cannot be regarded as trustworthy.) The figures given below show the
number of males in successive age groups, together with the number of the
blind (A), of the mentally-deranged (B), and the blind mentally-deranged
(AB). Trace the association between blindness and mental derangement
from childhood to old age, tabulating the proportions of insane amongst the
whole population and amongst the blind, and also the association coefficient
Q of § 13. Give a short verbal statement of your results.



    Age      5-         15-        25-        35-        45-        55-       65-     75 and upwards

    N     3,304,230  2,712,621  2,089,010  1,011,077  1,191,789   770,124   444,890    161,692
    (A)         844      1,184      1,165      1,601      1,762     1,906     1,932      1,701
    (B)       2,820      6,225      8,482      9,214      8,187     6,799     8,412      1,098
    (AB)         17         19         19         31         32        84        22          9


7. Show that if

    (AB)₁   (αB)₁   (Aβ)₁   (αβ)₁
    (AB)₂   (αB)₂   (Aβ)₂   (αβ)₂

be two aggregates corresponding to the same values of (A), (B), (α), and (β),

\[ (AB)_1 - (AB)_2 = (\alpha B)_2 - (\alpha B)_1 = (A\beta)_2 - (A\beta)_1 = (\alpha\beta)_1 - (\alpha\beta)_2. \]

8. Show that if

\[ \delta = (AB) - (AB)_0, \]

\[ (AB)^2 + (\alpha\beta)^2 - (\alpha B)^2 - (A\beta)^2 = [(A) - (\alpha)][(B) - (\beta)] + 2N\delta. \]

9. The existence of association may be tested either by comparison of
proportions (e.g. (AB)/(B) with (Aβ)/(β)), as in §§ 9, 10, or by the value of δ, as
in §§ 11, 12. Show that





CHAPTER IV. 


PARTIAL ASSOCIATION. 

1-2. Uncertainty in interpretation of an observed association. 3-5. Source of
the ambiguity: partial associations. 6-8. Illusory association due
to the association of each of two attributes with a third. 9. Estimation
of the partial associations from the frequencies of the second
order. 10-12. The total number of associations for a given number
of attributes. 13-14. The case of complete independence.

1. If we find that in any given case

\[ (AB) > \ \text{or} \ < \frac{(A)(B)}{N}, \]

all that is known is that there is a relation of some sort or kind
between A and B. The result by itself cannot tell us whether
the relation is direct, whether possibly it is only due to "fluctuations
of sampling" (cf. Chap. III. §§ 7-8), or whether it is of any other
particular kind that we may happen to have in our minds at the
moment. Any interpretation of the meaning of the association is
necessarily hypothetical, and the number of possible alternative
hypotheses is in general considerable.

2. The commonest of all forms of alternative hypothesis is of
this kind: it is argued that the relation between the two attributes
A and B is not direct, but due, in some way, to the association of
A with C and of B with C. An illustration or two will make the
matter clearer:

(1) An association is observed between "vaccination" and
"exemption from attack by small-pox," i.e. more of the vaccinated
than of the unvaccinated are exempt from attack. It is argued
that this does not imply a protective effect of vaccination, but is
wholly due to the fact that most of the unvaccinated are drawn from
the lowest classes, living in very unhygienic conditions. Denoting
vaccination by A, exemption from attack by B, hygienic conditions by
C, the argument is that the observed association between A and B
is due to the associations of both with C.



(2) It is observed, at a general election, that a greater
proportion of the candidates who spent more money than their
opponents won their elections than of those who spent less. It
is argued that this does not mean an influence of expenditure on
the result of elections, but is due to the fact that Conservative
principles generally carried the day, and that the Conservatives
generally spent more than the Liberals. Denoting winning by A,
spending more than the opponent by B, and Conservative by C, the
argument is the same as the above (cf. Question 9 at the end of
the chapter).

(3) An association is observed between the presence of some
attribute in the father and its presence in the son; and also
between the presence of the attribute in the grandfather and its
presence in the grandson. Denoting the presence of the attribute
in son, father, and grandfather by A, B, and C, the question arises
whether the association between A and C may not be due solely
to the associations between A and B, B and C, respectively.

3. The ambiguity in such cases evidently arises from the fact
that the universe of observation, in each case, contains not
merely objects possessing the third attribute alone, or objects
not possessing it, but both.

If the universe were restricted to either class alone the given
ambiguity would not arise, though of course others might remain.

Thus, in the first illustration, if the statistics of vaccination
and attack were drawn from one narrow section of the population
living under approximately the same hygienic conditions, and an
association were still observed between vaccination and exemption
from attack, the supposed argument would be refuted. The fact
would prove that the association between vaccination and
exemption could not be wholly due to the association of both with
hygienic conditions.

Again, in the second illustration, if we confine our attention to
the "universe" of Conservatives (instead of dealing with candidates
of both parties together), and compare the percentages of
Conservatives winning elections when they spend more than their opponents
and when they spend less, we shall avoid the possible fallacy. If
the percentage is greater in the former case than in the latter, it
cannot be for the reasons suggested in § 2.

The biological case of the third illustration should be similarly
treated. If the association between A and C be observed for
those cases in which all the parents, say, possess the attribute, or
else all do not, and it is still sensible, then the association first
observed between A and C for the whole universe cannot have
been due solely to the observed associations between A and B, B
and C.





4. The associations observed between the attributes A and B
in the universe of C's and the universe of γ's may be termed
partial associations, to distinguish them from the total associations
observed between A and B in the universe at large. In terms of
the definition of § 5 of Chap. III., A and B will be said to be
positively associated in the universe of C's (cf. § 4 of Chap. II.) when

\[ (ABC) > \frac{(AC)(BC)}{(C)} \tag{1} \]

and negatively associated in the converse case.

As in the simpler case, the association is most simply tested by
a comparison of percentages or proportions (§ 9, Chap. III.),
although for some purposes a "coefficient of association" of
some kind may be useful. Confining our attention to the more
fundamental method, if A and B are positively associated within
the universe of C's, we must have, to quote only the four most
convenient comparisons,

\[
\begin{aligned}
(a)\quad \frac{(ABC)}{(BC)} &> \frac{(AC)}{(C)} &\qquad (b)\quad \frac{(ABC)}{(AC)} &> \frac{(BC)}{(C)} \\
(c)\quad \frac{(ABC)}{(BC)} &> \frac{(A\beta C)}{(\beta C)} &\qquad (d)\quad \frac{(ABC)}{(AC)} &> \frac{(\alpha BC)}{(\alpha C)}
\end{aligned}
\tag{2}
\]

These inequalities may easily be rewritten for any other case by
making the proper substitutions in the symbols; thus to obtain
the inequalities for testing the association between A and C in
the universe of B's, B must be written for C, β for γ, and vice
versa, throughout, it being remembered that the order of the
letters in the class symbol is immaterial. The remarks of § 10,
Chap. III., as to the choice of the comparison to be used, apply of
course equally to the present case.

5. Though we shall confine ourselves in the present work to
the detailed discussion of the case of three attributes, it should be
noticed that precisely similar conceptions and formulae to the
above apply in the general case where more than three attributes
have been noted, or where the relations of more than three have
to be taken into account. If, when it is observed that A and B
are still associated within the universe of C's, it is argued that
this is due to the association of both A and B with D, the
argument may be tested by still further limiting the field of
observation to the universe CD. If

\[ (ABCD) > \frac{(ACD)(BCD)}{(CD)}, \]

A and B are positively associated within the universe of CD's,
and the association cannot be wholly ascribed to the presence and
absence of D as suggested, nor to the presence and absence of
C and D conjointly. If it be then argued that the presence
and absence of E is the source of association, the process may
be repeated as before, the association of A and B being tested
for the universe CDE, and so on as far as practicable.

Partial associations thus form the basis of discussion for any
case, however complicated. The two following examples will
serve as illustrations for the case of three attributes.

Example i. (Material from ref. 5 of Chap. I.)

The following are the proportions per 10,000 of boys observed
with certain classes of defects, amongst a number of school
children. (A) denotes the number with development defects, (B)
with nerve-signs, (D) the number of the "dull."

    N      10,000          (AB)       338
    (A)       877          (AD)       338
    (B)     1,086          (BD)       455
    (D)       789          (ABD)      153

The Report from which the figures are drawn concludes that "the
connecting link between defects of body and mental dulness is
the coincident defect of brain which may be known by observation
of abnormal nerve-signs." Discuss this conclusion.

The phrase "connecting link" is a little vague, but it may
mean that the mental defects indicated by nerve-signs B may
give rise to development-defects A, and also to mental dulness
D; A and D being thus common effects of the same cause
B (or another attribute necessarily indicated by B), and not
directly influencing each other. The case is thus similar to that
of the first illustration of § 2 (liability to small-pox and to
non-vaccination being held to be common effects of the same
circumstances), and may be similarly treated by investigation of the
partial associations between A and D for the universes B and β.
As the ratios (A)/N, (D)/N are small, comparisons of the
form (4) (b) of Chap. III. (p. 31), or (2) (a) (b) above, may very
well be used (cf. the remarks in § 10 of the same chapter,
p. 31).

The following figures illustrate, then, the association between
A and D for the whole universe, the B-universe and the
β-universe:

For the entire material:

    Proportion of the dull = (D)/N . . . . . . . . . .    789/10,000 =  7.9 per cent.
    Proportion of the defectively developed who
      were dull = (AD)/(A) . . . . . . . . . . . . . .    338/877    = 38.5 per cent.

For those exhibiting nerve signs:

    Proportion of the dull = (BD)/(B) . . . . . . . . .   455/1,086  = 41.9 per cent.
    Proportion of the defectively developed who
      were dull = (ABD)/(AB) . . . . . . . . . . . . .    153/338    = 45.3 per cent.

For those not exhibiting nerve signs:

    Proportion of the dull = (βD)/(β) . . . . . . . . .   334/8,914  =  3.7 per cent.
    Proportion of the defectively developed who
      were dull = (AβD)/(Aβ) . . . . . . . . . . . . .    185/539    = 34.3 per cent.

The results are extremely striking; the association between A
and D is very high indeed both for the material as a whole (the
universe at large) and for those not exhibiting nerve-signs (the
β-universe), but it is very small for those who do exhibit
nerve-signs (the B-universe).

This result does not appear to be in accord with the conclusion
of the Report, as we have interpreted it, for the association
between A and D in the β-universe should in that case have
been very low instead of very high.
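
The comparisons of Example i may be reproduced by a short sketch in Python, given here as an illustration and not as part of the original text. The frequencies are those tabulated in the example (N = 10,000, (A) = 877, (B) = 1,086, (D) = 789, (AB) = 338, (AD) = 338, (BD) = 455, (ABD) = 153); the helper `pct` and the variable names are assumptions introduced for the purpose. The frequencies for the β-universe are obtained by subtraction.

```python
# Frequencies per 10,000 boys, as tabulated in Example i
N, A, B, D = 10_000, 877, 1_086, 789
AB, AD, BD, ABD = 338, 338, 455, 153

def pct(num, den):
    """Percentage num/den, rounded to one decimal place."""
    return round(100 * num / den, 1)

# Frequencies in the beta-universe (no nerve-signs), by subtraction
beta = N - B            # (beta)
A_beta = A - AB         # (A beta)
beta_D = D - BD         # (beta D)
A_beta_D = AD - ABD     # (A beta D)

# For each universe: (proportion of the dull,
#                     proportion of the dull among the defectively developed)
comparisons = {
    "whole universe": (pct(D, N), pct(AD, A)),
    "B-universe": (pct(BD, B), pct(ABD, AB)),
    "beta-universe": (pct(beta_D, beta), pct(A_beta_D, A_beta)),
}

for universe, (dull, dull_among_defective) in comparisons.items():
    print(universe, dull, dull_among_defective)
```

Each pair is a comparison of the form (2) (a): the wider the gap between the two percentages, the stronger the association between A and D in that universe.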

Example ii. Eye-colour of grandparent, parent and child.
(Material from Sir Francis Galton's Natural Inheritance (1889),
table 20, p. 216. The table only gives particulars for 78 large
families with not less than 6 brothers or sisters, so that the
material is hardly entirely representative, but serves as a good
illustration of the method.) The original data are treated as in
Example vii. of the last chapter (p. 33). Denoting a light-eyed
child by A, parent by B, grandparent by C, every possible line of
descent is taken into account. Thus, taking the following two
lines of the table,

         Children             Parents            Grandparents
    A (light-eyed)   α    B (light-eyed)   β    C (light-eyed)   γ
          4          5          1          1          1          3
          3          4          1          1          4          0

the first would give 4 × 1 × 1 = 4 to the class ABC, 4 × 1 × 3 = 12 to
the class ABγ, 4 to AβC, 12 to Aβγ, 5 to αBC, 15 to αBγ, 5 to αβC,
and 15 to αβγ; the second would give 3 × 1 × 4 = 12 to the
class ABC, 12 to AβC, 16 to αBC, 16 to αβC, and none to the
remainder. The class-frequencies so derived from the whole table are:

    (ABC)   1928          (αBC)   303
    (ABγ)    596          (αBγ)   225
    (AβC)    552          (αβC)   395
    (Aβγ)    508          (αβγ)   501

The following comparisons indicate the association between
grandparents and parents, parents and children, and
grandparents and grandchildren, respectively:

Grandparents and Parents.

    Proportion of light-eyed amongst the children of
      light-eyed grandparents . . . . . . . = (BC)/(C) = 2231/3178 = 70.2 per cent.
    Proportion of light-eyed amongst the children of
      not-light-eyed grandparents . . . . . = (Bγ)/(γ) =  821/1830 = 44.9 per cent.

Parents and Children.

    Proportion of light-eyed amongst the children of
      light-eyed parents  . . . . . . . . . = (AB)/(B) = 2524/3052 = 82.7 per cent.
    Proportion of light-eyed amongst the children of
      not-light-eyed parents  . . . . . . . = (Aβ)/(β) = 1060/1956 = 54.2 per cent.

In both the above cases we are really dealing with the
association between parent and offspring, and consequently the
intensity of association is, as might be expected, approximately
the same; in the next case it is naturally lower.

Grandparents and Grandchildren.

    Proportion of light-eyed amongst the grandchildren
      of light-eyed grandparents  . . . . . = (AC)/(C) = 2480/3178 = 78.0 per cent.
    Proportion of light-eyed amongst the grandchildren
      of not-light-eyed grandparents  . . . = (Aγ)/(γ) = 1104/1830 = 60.3 per cent.

We proceed now to test the partial associations between
grandparents and grandchildren, as distinct from the total associations
given above, in order to throw light on the real nature of the
resemblance. There are two such partial associations to be
tested: (1) where the parents are light-eyed, (2) where they are
not-light-eyed. The following are the comparisons:

Grandparents and Grandchildren: Parents light-eyed.

    Proportion of light-eyed amongst the grandchildren
      of light-eyed grandparents  . . . . . = (ABC)/(BC) = 1928/2231 = 86.4 per cent.
    Proportion of light-eyed amongst the grandchildren
      of not-light-eyed grandparents  . . . = (ABγ)/(Bγ) =  596/821  = 72.6 per cent.

Grandparents and Grandchildren: Parents not-light-eyed.

    Proportion of light-eyed amongst the grandchildren
      of light-eyed grandparents  . . . . . = (AβC)/(βC) = 552/947  = 58.3 per cent.
    Proportion of light-eyed amongst the grandchildren
      of not-light-eyed grandparents  . . . = (Aβγ)/(βγ) = 508/1009 = 50.3 per cent.

In both cases the partial association is quite well-marked and
positive; the total association between grandparents and
grandchildren cannot, then, be due wholly to the total associations
between grandparents and parents, parents and children,
respectively. There is an ancestral heredity, as it is termed, as
well as a parental heredity.

We need not discuss the partial association between children and
parents, as it is comparatively of little consequence. It may be
noted, however, as regards the above results, that the most
important feature may be brought out by stating three ratios
only.

If A and B are positively associated, (AB)/(B) > (A)/N.

If A and C are positively associated in the universe of B's,
(ABC)/(BC) > (AB)/(B). Hence (A)/N, (AB)/(B), and (ABC)/(BC)
form an ascending series. Thus we have from the given data:

    Proportion of light-eyed amongst all the
      children  . . . . . . . . . . . . . . = (A)/N      = 71.6 per cent.
    Proportion of light-eyed amongst the children of
      light-eyed parents  . . . . . . . . . = (AB)/(B)   = 82.7 per cent.
    Proportion of light-eyed amongst the children of
      light-eyed parents and grandparents . = (ABC)/(BC) = 86.4 per cent.

If the great-grandparents, etc., were also known, the series
might be continued, giving (ABCD)/(BCD), (ABCDE)/(BCDE),
and so forth. The series would probably ascend continuously
though with smaller intervals, A and D being positively associated
in the universe of BC's, A and E in the universe of BCD's, etc.
6. The above examples will serve to illustrate the practical
application of partial associations to concrete cases. The general
nature of the fallacies involved in interpreting associations
between two attributes as if they were necessarily due to the
most obvious form of direct causation is more clearly exhibited
by the following theorem:

If A and B are independent within the universe of C's and also
within the universe of γ's, they will nevertheless be associated
within the universe at large, unless C is independent of either A
or B or both.

The two data give:

\[ (ABC) = \frac{(AC)(BC)}{(C)}, \]

\[ (AB\gamma) = \frac{(A\gamma)(B\gamma)}{(\gamma)} = \frac{[(A) - (AC)][(B) - (BC)]}{(\gamma)}. \]

Adding them together we have:

\[ (AB) = \frac{1}{(C)(\gamma)} \left\{ N(AC)(BC) - (A)(C)(BC) - (B)(C)(AC) + (A)(B)(C) \right\}. \tag{3} \]

Write, as in § 11 of Chap. III. (p. 35):

\[ (AB)_0 = \frac{(A)(B)}{N}, \qquad (AC)_0 = \frac{(A)(C)}{N}, \qquad (BC)_0 = \frac{(B)(C)}{N}, \]

subtract (AB)₀ from both sides of the above equation, simplify,
and we have

\[ (AB) - (AB)_0 = \frac{N}{(C)(\gamma)} \left[ (AC) - (AC)_0 \right] \left[ (BC) - (BC)_0 \right]. \tag{4} \]

This proves the theorem; for the right-hand side will not be
zero unless either (AC) = (AC)₀ or (BC) = (BC)₀.
7. The result indicates that, while no degree of heterogeneity
in the universe can influence the association between A and B
if all other attributes are independent of either A or B or both,
an illusory or misleading association may arise in any case where
there exists in the given universe a third attribute C with which
both A and B are associated (positively or negatively). If both
associations are of the same sign, the resulting illusory association
between A and B will be positive; if of opposite sign, negative.
The three illustrations of § 2 are all of the first kind. In (1) it
is argued that the positive associations between vaccination and
hygienic conditions, exemption from attack and hygienic conditions,
give rise to an illusory positive association between vaccination
and exemption from attack. In (2) it is argued that the positive
associations between conservative and winning, conservative and
spending more, give rise to an illusory positive association between
winning and spending more. In (3) the question is raised whether
the positive association between grandparent and grandchild may
not be due solely to the positive associations between grandparent
and parent, parent and child.

Misleading associations of this kind may easily arise through
the mingling of records, e.g. respecting the two sexes, which a
careful worker would keep distinct.

Take the following case, for example. Suppose there have been
200 patients in a hospital, 100 males and 100 females, suffering
from some disease. Suppose, further, that the death-rate for males
(the case mortality) has been 30 per cent., for females 60 per cent.
A new treatment is tried on 80 per cent. of the males and 40 per
cent. of the females, and the results published without distinction
of sex. The three attributes, with the relations of which we are
here concerned, are death, treatment and male sex. The data show
that more males were treated than females, and more females
died than males; therefore the first attribute is associated
negatively, the second positively, with the third. It follows that there
will be an illusory negative association between the first two,
death and treatment. If the treatment were completely inefficient
we would, in fact, have the following results:

                                     Males.   Females.   Total.
    Treated and died . . . . . . .     24        24        48
    Treated and did not die  . . .     56        16        72
    Not treated and died . . . . .      6        36        42
    Not treated and did not die  .     14        24        38

i.e. of the treated, only 48/120 = 40 per cent. died, while of those
not treated 42/80 = 52.5 per cent. died. If this result were stated
without any reference to the fact of the mixture of the sexes, to
the different proportions of the two that were treated and to the
different death-rates under normal treatment, then some value in
the new treatment would appear to be suggested. To make
a fair return, either the results for the two sexes should be
stated separately, or the same proportion of the two sexes
must receive the experimental treatment. Further, care would
have to be taken in such a case to see that there was no
selection (perhaps unconscious) of the less severe cases for
treatment, thus introducing another source of fallacy (death positively
associated with severity, treatment negatively associated with
severity, giving rise to illusory negative association between
treatment and death).
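
The hospital table may be rebuilt from the stated rates by a few lines of Python. The sketch is illustrative only and not part of the original text; the dictionary names are assumptions. The treatment is taken to be completely inefficient, so that within each sex the death-rate is the same for treated and untreated.

```python
from fractions import Fraction

patients = {"male": 100, "female": 100}
death_rate = {"male": Fraction(30, 100), "female": Fraction(60, 100)}
treated_share = {"male": Fraction(80, 100), "female": Fraction(40, 100)}

# Build the table for each sex, the treatment having no effect on mortality
table = {}
for sex, n in patients.items():
    treated = n * treated_share[sex]
    untreated = n - treated
    table[sex] = {
        ("treated", "died"): treated * death_rate[sex],
        ("treated", "survived"): treated * (1 - death_rate[sex]),
        ("untreated", "died"): untreated * death_rate[sex],
        ("untreated", "survived"): untreated * (1 - death_rate[sex]),
    }

# Pool the two sexes, as in the published return
pooled = {cell: sum(table[sex][cell] for sex in table) for cell in table["male"]}

treated_mortality = pooled[("treated", "died")] / (
    pooled[("treated", "died")] + pooled[("treated", "survived")]
)
untreated_mortality = pooled[("untreated", "died")] / (
    pooled[("untreated", "died")] + pooled[("untreated", "survived")]
)

print(pooled)
print(float(treated_mortality), float(untreated_mortality))
```

Within each sex the treated and untreated die at the same rate; the apparent advantage of the treatment arises wholly on pooling.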

A misleading association between the characters of parent and
offspring might similarly be created if the records for male-male
and female-female lines of descent were mixed. Thus suppose 50
per cent. of males and 10 per cent. of females exhibit some
attribute for which there is no association in either line; then we
would have for each line and for a mixed record of equal
numbers:

                                         Male line.    Female line.   Mixed record.
Parents with attribute and
  children with  . . . . . . . . . .    25 per cent.    1 per cent.   13 per cent.
Parents with attribute and
  children without . . . . . . . . .    25    „         9    „        17    „
Parents without attribute and
  children with  . . . . . . . . . .    25    „         9    „        17    „
Parents without attribute and
  children without . . . . . . . . .    25    „        81    „        53    „




Here 13/30 = 43 per cent. of the offspring of parents with the
attribute possess the attribute themselves, but only 17/70 = 24
per cent. of the offspring of parents without the attribute. The
association between attribute in parent and attribute in offspring
is, however, due solely to the association of both with male sex.
The student will see that if records for male-female and
female-male lines were mixed, the illusory association would be negative,
and that if all four lines were combined there would be no illusory
association at all.
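
The mixed record can be reproduced numerically. The following Python sketch assumes, as in the text, that the attribute occurs with the same frequency in parent and child within a line and is independent between them, and that the two lines are mixed in equal numbers. The helper name `line_record` is hypothetical.

```python
def line_record(p):
    """Joint proportions (parent has attribute, child has attribute) when the
    attribute has frequency p in both generations and there is no association."""
    return {
        (True, True): p * p,
        (True, False): p * (1 - p),
        (False, True): (1 - p) * p,
        (False, False): (1 - p) * (1 - p),
    }


male = line_record(0.5)     # male-male line: 50 per cent. show the attribute
female = line_record(0.1)   # female-female line: 10 per cent. show the attribute

# Mixed record of equal numbers from the two lines.
mixed = {key: (male[key] + female[key]) / 2 for key in male}

# Proportion of offspring with the attribute, by attribute of the parent.
with_parent = mixed[(True, True)] / (mixed[(True, True)] + mixed[(True, False)])
without_parent = mixed[(False, True)] / (mixed[(False, True)] + mixed[(False, False)])

print(mixed)
print(with_parent, without_parent)
```

Although neither line shows any association, the mixed record gives clearly different proportions for the two classes of parent.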

8. Illusory associations may also arise in a different way
through the personality of the observer or observers. If the
observer's attention fluctuates, he may be more likely to notice
the presence of A when he notices the presence of B, and vice
versa; in such a case A and B (so far as the record goes) will both
be associated with the observer's attention C, and consequently
an illusory association will be created. Again, if the attributes
are not well defined, one observer may be more generous than
another in deciding when to record the presence of A and also
the presence of B, and even one observer may fluctuate in the
generosity of his marking. In this case the recording of A and
the recording of B will both be associated with the generosity
of the observer in recording their presence, C, and an illusory
association between A and B will consequently arise, as
before.

9. It is important to notice that, though we cannot actually
determine the partial associations unless the third-order frequency
(ABC) is given, we can make some conjecture as to their sign
from the values of the second-order frequencies.

Suppose, for instance, that

$$(ABC) = \frac{(AC)(BC)}{(C)} + \delta_1, \qquad (AB\gamma) = \frac{(A\gamma)(B\gamma)}{(\gamma)} + \delta_2,$$

so that δ₁ and δ₂ are positive or negative according as A and B
are positively or negatively associated in the universes of C and
γ respectively. Then we have by addition

$$(AB) = \frac{(AC)(BC)}{(C)} + \frac{(A\gamma)(B\gamma)}{(\gamma)} + \delta_1 + \delta_2 \qquad (6)$$

Hence if the value of (AB) exceed the value given by the first
two terms (i.e. if δ₁ + δ₂ be positive), A and B must be positively
associated either in the universe of C's, the universe of γ's, or
both. If, on the other hand, (AB) fall short of the value given by
the first two terms, A and B must be negatively associated in
the universe of C's, the universe of γ's, or both. Finally, if
(AB) be equal to the value of the first two terms, A and B must
be positively associated in the one partial universe and negatively
in the other, or else independent in both.

The expression (6) may often be used in the following form,
obtained by dividing through by, say, (B):

$$\frac{(AB)}{(B)} = \frac{(AC)}{(C)}\cdot\frac{(BC)}{(B)} + \frac{(A\gamma)}{(\gamma)}\cdot\frac{(B\gamma)}{(B)} + \frac{\delta_1 + \delta_2}{(B)} \qquad (7)$$

In using this expression we make use solely of proportions or
percentages, and judge of the sign of the partial associations
between A and B accordingly. A concrete case, as in Example iii.
below, is perhaps clearer than the general formula.

Example iii. (Figures compiled from Supplement to the
Fifty-fifth Annual Report of the Registrar-General [C. 8503], 1897.)
The following are the death-rates per thousand per annum, and the
proportions over 65 years of age, of occupied males in general,
farmers, textile workers, and glass workers (over 15 years of age
in each case) during the decade 1891-1900 in England and Wales.


                                      Death-rate       Proportion per
                                     per thousand.    thousand over 65
                                                        Years of Age.
Occupied males over 15  . . . . .        15.8                46
Farmers, males over 15  . . . . .        19.6               132
Textile workers, males over 15  .        15.9                34
Glass workers, males over 15  . .        16.6                16


Would farming, textile working, and glass working seem to be
relatively healthy or unhealthy occupations, given that the
death-rates among occupied males from 15-65 and over 65 years of age
are 11.5 and 102.3 per thousand respectively?

If A denote death, B the given occupation, C old age, we have
to apply the principle of equation (7). Calculate what would be
the death-rate for each occupation on the supposition that the
death-rates for occupied males in general (11.5, 102.3) apply to
each of its separate age-groups (under 65, over 65), and see
whether the total death-rate so calculated exceeds or falls short
of the actual death-rate. If it exceeds the actual rate, the
occupation must on the whole be healthy; if it falls short,
unhealthy. Thus we have the following calculated death-rates:

Farmers  . . . . .   11.5 × .868 + 102.3 × .132 = 23.5.
Textile workers  .   11.5 × .966 + 102.3 × .034 = 14.6.
Glass workers  . .   11.5 × .984 + 102.3 × .016 = 13.0.

The calculated rate for farmers largely exceeds the actual rate;
farming, then, must on the whole, as one would expect, be
a healthy occupation. The death-rate for either young farmers
or old farmers, or both, must be less than for occupied males in
general (the last is actually the case); the high death-rate
observed is due solely to the large proportion of the aged. Textile
working, on the other hand, appears to be unhealthy (14.6 < 15.9),
and glass working still more so (13.0 < 16.6); the actual low total
death-rates are due merely to low proportions of the aged.
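
The calculation of Example iii. may be repeated mechanically. The sketch below is one possible arrangement in Python, using only the figures quoted in the example; the names `expected_rate` and `results` are assumptions made for the illustration.

```python
# Death-rates per thousand for occupied males in general, by age-group.
RATE_UNDER_65 = 11.5
RATE_OVER_65 = 102.3

# occupation: (actual death-rate per thousand, proportion per thousand over 65)
occupations = {
    "Farmers": (19.6, 132),
    "Textile workers": (15.9, 34),
    "Glass workers": (16.6, 16),
}


def expected_rate(old_per_thousand):
    """Death-rate the occupation would show if the general age-specific
    rates applied to its own proportions under and over 65."""
    old = old_per_thousand / 1000
    return RATE_UNDER_65 * (1 - old) + RATE_OVER_65 * old


results = {}
for name, (actual, old) in occupations.items():
    calculated = expected_rate(old)
    # Calculated rate above the actual rate: the occupation is on the whole healthy.
    verdict = "healthy" if calculated > actual else "unhealthy"
    results[name] = (round(calculated, 1), verdict)

print(results)
```

The comparison is between the calculated and the actual rate for the same occupation, not between occupations, since the age-distributions differ.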

It is evident that age-distributions vary so largely from one
occupation to another that total death-rates are liable to be very
misleading; so misleading, in fact, that they are not tabulated at all
by the Registrar-General; only death-rates for narrow limits of age
(5 or 10 year age-classes) are worked out. Similar fallacies are
liable to occur in comparisons of local death-rates, owing to
variations not only in the relative proportions of the old, but also
in the relative proportions of the two sexes.

It is hardly necessary to observe that as age is a variable quantity,
the above procedure for calculating the comparative death-rates
is extremely rough. The death-rate of those engaged in any
occupation depends not only on the mere proportions over and under
65, but on the relative numbers at every single year of age. The
simpler procedure brings out, however, better than a more complex
one, the nature of the fallacy involved in assuming that crude
death-rates are measures of healthiness. [See also Chap. XI. §§ 17-19.]

Example iv. Eye-colour in grandparent, parent and child.
(The figures are those of Example ii.)

A, light-eyed child; B, light-eyed parent; C, light-eyed
grandparent.

N   = 5008        (AB) = 2524
(A) = 3584        (AC) = 2480
(B) = 3052        (BC) = 2231
(C) = 3178





Given only the above data, investigate whether there is probably
a partial association between child and grandparent.

If there were no partial association we would have

$$(AC) = \frac{(AB)(BC)}{(B)} + \frac{(A\beta)(\beta C)}{(\beta)} = \frac{2524 \times 2231}{3052} + \frac{1060 \times 947}{1956} = 1845.0 + 513.2 = 2358.2.$$

Actually (AC) = 2480; there must, then, be partial association
either in the B-universe, the β-universe, or both. In the absence
of any reason to the contrary, it would be natural to suppose there
is a partial association in both, i.e. that there is a partial
association with the grandparent whether the line of descent
passes through "light-eyed" or "not-light-eyed" parents, but this
could not be proved without a knowledge of the class-frequency
(ABC).
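
The arithmetic of Example iv. may be verified as follows. This Python sketch uses only the frequencies appearing in the calculation above; the variable names are assumptions chosen to mirror the class symbols.

```python
# Frequencies quoted in Example iv.
N = 5008
B = 3052            # light-eyed parents
AB = 2524           # light-eyed child and light-eyed parent
BC = 2231           # light-eyed parent and light-eyed grandparent
AC = 2480           # light-eyed child and light-eyed grandparent (observed)

beta = N - B        # not-light-eyed parents
A_beta = 1060       # light-eyed child, not-light-eyed parent
beta_C = 947        # not-light-eyed parent, light-eyed grandparent

# Value of (AC) if child and grandparent were independent in both
# the B-universe and the beta-universe.
expected_AC = AB * BC / B + A_beta * beta_C / beta

print(round(expected_AC, 1), AC)
```

Since the observed (AC) exceeds the calculated value, the sum of the two partial differences is positive.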

10. The total possible number of associations to be derived from
n attributes grows so rapidly with the value of n that the
evaluation of them all for any case in which n is greater than four
becomes almost unmanageable. For three attributes there are 9
possible associations: three totals, three partials in positive
universes, and three partials in negative universes. For four
attributes, the number of possible associations rises to 54,
for there are 6 pairs to be formed from four attributes, and
we can find 9 associations for each pair (1 total, 4 partials
with the universe specified by one attribute, and 4 partials
with the universe specified by two). For five attributes the
student will find that there are no less than 270, and for six
attributes 1215 associations.
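
These totals follow a simple rule, which the Python sketch below illustrates. It rests on one reading of the counting in the text: for each pair of attributes, every remaining attribute is either left unspecified, specified as present, or specified as absent, giving 3ⁿ⁻² universes per pair. The function names are hypothetical.

```python
from itertools import combinations, product
from math import comb


def total_associations(n):
    """Number of pairs of attributes times the 3**(n - 2) possible universes."""
    return comb(n, 2) * 3 ** (n - 2)


def enumerate_associations(n):
    """Brute-force count. For each pair, each other attribute is unspecified
    ('-'), specified as present ('+'), or specified as absent ('0')."""
    count = 0
    for _pair in combinations(range(n), 2):
        for _universe in product("-+0", repeat=n - 2):
            count += 1
    return count


counts = {n: total_associations(n) for n in range(3, 7)}
print(counts)
```

The brute-force enumeration is included only as a cross-check on the closed form.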

As suggested by Examples i. and ii. above, however, it is not
necessary in any actual case to investigate all the associations
that are theoretically possible; the nature of the problem indicates
those that are required.

In Example i., for instance, the total and partial associations
between A and D were alone investigated; the associations between
A and B, B and D were not essential for answering the question
that was asked. In Example ii., again, the three total associations
and the partial association between A and C were worked out,
but the partial associations between A and B, B and C were
omitted as unnecessary. Practical considerations of this kind will
always lessen the amount of necessary labour.





11. It might appear, at first sight, that theoretical
considerations would enable us to lessen it still further. As we saw in
Chapter I., all class-frequencies can be expressed in terms of those
of the positive classes, of which there are 2ⁿ in the case of n
attributes. For given values of the n + 1 frequencies N, (A), (B),
(C), ... of order lower than the second, assigned values of the
positive class-frequencies of the second and higher orders must
therefore correspond to determinate values of all the possible
associations. But the number of these positive class-frequencies
of the second and higher orders is only 2ⁿ − (n + 1); therefore the
number of algebraically independent associations that can be
derived from n attributes is only 2ⁿ − (n + 1). For successive
values of n this gives:

    n        2ⁿ − (n + 1)
    2             1
    3             4
    4            11
    5            26
    6            57

Hence if we give data, in any form, that determine four
associations in the case of three attributes, eleven in the case of
four attributes, and so on, in addition to N and the class-frequencies
of the first order, we have done all that is theoretically necessary.
The remaining associations can be deduced.
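
The little table of § 11 may be checked by counting the positive classes directly. In the Python sketch below the function names are hypothetical; the second function counts the positive classes of each order from the second upward.

```python
from math import comb


def independent_associations(n):
    """All 2**n positive classes, less N and the n first-order frequencies."""
    return 2 ** n - (n + 1)


def positive_classes_of_order_two_and_up(n):
    """Direct count: there are comb(n, k) positive classes of order k."""
    return sum(comb(n, k) for k in range(2, n + 1))


table = [(n, independent_associations(n)) for n in range(2, 7)]
print(table)
```

The two counts agree because the binomial coefficients for orders 0 to n sum to 2ⁿ.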

12. Practically, however, the mere fact that they can be deduced
is of little help unless such deduction can be effected simply,
indeed almost directly, by mere mental arithmetic, and
this is not the case. The relations that exist between the ratios
or differences, such as (AB) − (AB)₀, that indicate the associations
are, in fact, so complex that an unknown association cannot be
determined from those that are given without more or less lengthy
work; it is not possible to infer even its sign by any simple
process of inspection. We have, for instance, from (6), by the
process used in obtaining (4) for the special case of § 6:

$$(AB\gamma) - \frac{(A\gamma)(B\gamma)}{(\gamma)} = \left[(AB) - \frac{(A)(B)}{N}\right] - \frac{N}{(C)(\gamma)}\left[(AC) - \frac{(A)(C)}{N}\right]\left[(BC) - \frac{(B)(C)}{N}\right] - \left[(ABC) - \frac{(AC)(BC)}{(C)}\right],$$

which gives us the difference of (ABγ) from the value it would
have if A and B were independent in the universe of γ's in terms
of the difference of (ABC) from the value it would have if A and
B were independent in the universe of C's, and the corresponding
differences for the frequencies (AB), (AC), and (BC). The four
quantities in the brackets on the right represent, say, the four
known associations, the term on the left the unknown association.
Clearly, the relation is not of such a simple kind that the term on
the left can be, in general, mentally evaluated. Hence in
considering the choice and number of associations to be actually [...]

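13. [...] It follows, in the first place, that all other possible associations
must be zero, i.e. that a state of complete independence
exists. Suppose, for instance, that we are given

$$(AB) = \frac{(A)(B)}{N}, \qquad (AC) = \frac{(A)(C)}{N}, \qquad (BC) = \frac{(B)(C)}{N}, \qquad (ABC) = \frac{(AC)(BC)}{(C)} = \frac{(A)(B)(C)}{N^2}.$$

Then it follows at once that we have also

$$(ABC) = \frac{(AB)(BC)}{(B)} = \frac{(AB)(AC)}{(A)},$$

i.e. A and C are independent in the universe of B's, and B and C
in the universe of A's. Again,

$$(AB\gamma) = (AB) - (ABC) = \frac{(A)(B)}{N} - \frac{(A)(B)(C)}{N^2} = \frac{(A)(B)(\gamma)}{N^2} = \frac{(A\gamma)(B\gamma)}{(\gamma)},$$

so that A and B are also independent in the universe of γ's. [...]

$$\frac{(ABC)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}\cdot\frac{(C)}{N} \qquad (8)$$

[...] form of the equation of independence [...]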

14. It must be noted, however, that (8) is not a criterion for the
complete independence of A, B, and C in the sense that the
equation

$$\frac{(AB)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}$$

is a criterion for the complete independence of A and B. If we
are given N, (A), and (B), and the last relation quoted holds
good, we know that similar relations must hold for (Aβ), (αB),
and (αβ). If N, (A), (B), and (C) be given, however, and the
equation (8) hold good, we can draw no conclusion without
further information; the data are insufficient. There are eight
algebraically independent class-frequencies in the case of three
attributes, while N, (A), (B), (C) are only four: the equation (8)
must therefore be shown to hold good for four frequencies of the
third order before the conclusion can be drawn that it holds good
for the remainder, i.e. that a state of complete independence
subsists. The direct verification of this result is left for the
student.

Quite generally, if N, (A), (B), (C), ... be given, the relation

$$\frac{(ABC\ldots)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}\cdot\frac{(C)}{N}\cdots \qquad (9)$$

must be shown to hold good for 2ⁿ − (n + 1) of the nth order classes
before it may be assumed to hold good for the remainder. It is
only because

2ⁿ − (n + 1) = 1

when n = 2 that the relation

$$\frac{(AB)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}$$

may be treated as a criterion for the independence of A and B.
If all the n (n > 2) attributes are completely independent, the
relation (9) holds good; but it does not follow that if the relation
(9) hold good they are all independent.
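
A small numerical case may make the last point concrete. The frequencies in the Python sketch below are invented for the purpose and are not taken from the text: the relation holds for the class ABC, yet A and B are far from independent. The helper `freq` is likewise hypothetical.

```python
# Invented frequencies of the ultimate classes, keyed by
# (A present?, B present?, C present?); classes not listed are empty.
cells = {(1, 1, 1): 1, (1, 1, 0): 3, (0, 0, 1): 3, (0, 0, 0): 1}


def freq(a=None, b=None, c=None):
    """Class-frequency with the stated attributes fixed (None = unspecified)."""
    return sum(
        count
        for (x, y, z), count in cells.items()
        if a in (None, x) and b in (None, y) and c in (None, z)
    )


N = freq()
A, B, C = freq(a=1), freq(b=1), freq(c=1)

# The relation (ABC)/N = (A)/N . (B)/N . (C)/N, cleared of fractions.
holds_for_ABC = freq(1, 1, 1) * N ** 2 == A * B * C

# The same relation for the class AB-gamma, and the test for A and B alone.
holds_for_ABgamma = freq(1, 1, 0) * N ** 2 == A * B * freq(c=0)
pairwise_independent = freq(a=1, b=1) * N == A * B

print(holds_for_ABC, holds_for_ABgamma, pairwise_independent)
```

The relation must therefore be tested on the requisite number of third-order classes, not on one alone.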


REFERENCES.

(1) Yule, G. U., "On the Association of Attributes in Statistics," Phil.
    Trans. Roy. Soc., Series A, vol. cxciv., 1900, p. 257. (Deals fully
    with the theory of partial as well as of total association, with numerous
    illustrations: a notation suggested for the partial coefficients.)

(2) Yule, G. U., "Notes on the Theory of Association of Attributes in
    Statistics," Biometrika, vol. ii., 1903, p. 121. (Cf. especially §§ 4 and
    5, on the theory of complete independence, and the fallacies due to
    mixing of records.)





EXERCISES.

1. Take the following figures for girls corresponding to those for boys in
Example i., p. 45, and discuss them similarly, but not necessarily using
exactly the same comparisons, to see whether the conclusion that "the
connecting link between defects of body and mental dulness is the coincident
defect of brain which may be known by observation of abnormal nerve signs"
seems to hold good.

A, development defects. B, nerve signs. D, mental dulness.

N      10,000        (AB)     248
(A)       682        (AD)     307
(B)       860        (BD)     363
(D)       689        (ABD)    128

2. (Material from Census of England and Wales, 1891, vol. iii.) The
following figures give the numbers of those suffering from single or combined
infirmities (1) for all males, (2) for males of 55 years of age and over.

A, Blindness. B, Mental derangement. C, Deaf-mutism.

             (1)           (2)                      (1)          (2)
          All Males.    Males 55-.               All Males.   Males 55-.
N         14,053,000    1,377,000      (AB)         183           66
(A)           12,281        6,538      (AC)          51           14
(B)           45,392       10,809      (BC)         299           47
(C)            7,707          746      (ABC)         11            8

Tabulate proportions per thousand, exhibiting the total association between
blindness and mental derangement, and the partial association between the
same two infirmities among deaf-mutes, (1) for males in general, (2) for those
of 55 years of age or over. Give a short verbal statement of the results, and
contrast them with those of Question 1.
3. (Material from Supplement to 55th Annual Report Reg.-Genl.)

The death-rate from cancer for occupied males in general (over 15) is
0.686 per thousand per annum, and for farmers 1.20.
The death-rates from cancer for occupied males under and over 45
respectively are 0.13 and 2.25. Of the farmers 46.1 per cent. are over
45.

Would you say that farmers were peculiarly liable to cancer?

4. A population of males over 15 years of age consists of 7 per cent. over 65
years of age and 93 per cent. under. The death-rates are 12 per thousand per
annum in the younger class and 110 in the older, or 18.86 in the whole
population. The death-rate of males (over 15) engaged in a certain industry
is 26.7 per thousand.

If the industry be not unhealthy, what must be the approximate proportion
of those over 65 engaged in it (neglecting minor differences of age
distribution)?

5. Show that if A and B are independent, while A and C, B and C are
associated, A and B must be disassociated either in the universe of C's,
the universe of γ's, or both.

6. As an illustration of Question 5, show that if the following were actual
data there would be a slight disassociation between the eye-colours of
husband and wife (father and mother) for the parents either of light-eyed
sons or not-light-eyed sons, or both, although there is a slight positive
association for parents at large.

A, light eye-colour in husband; B, in wife; C, in son:

N      1000        (AB)    358
(A)     622        (AC)    471
(B)     558        (BC)    419
(C)     617

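7. Show that if (ABC) = (αβγ), (αBC) = (Aβγ), and so on (the case of
"complete equality of contrary frequencies" of Question 7, Chap. I.), A, B,
and C are completely independent if A and B, A and C, and B and C are
independent pair and pair.

8. If, in the same case of complete equality of contraries,

$$(AB) - N/4 = \delta_1, \qquad (AC) - N/4 = \delta_2, \qquad (BC) - N/4 = \delta_3,$$

show that

$$2\left[(ABC) - \frac{(AC)(BC)}{(C)}\right] = 2\left[(AB\gamma) - \frac{(A\gamma)(B\gamma)}{(\gamma)}\right] = \delta_1 - \frac{4\,\delta_2\delta_3}{N},$$

so that the partial associations between A and B in the universes C and γ are
positive or negative according as δ₁ is greater or less than 4δ₂δ₃/N.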

9. In the simple contests of a general election (contests in which one
Conservative opposed one Liberal and there were no other candidates) 66 per
cent. of the winning candidates (according to the returns) spent more money
than their opponents. Given that 63 per cent. of the winners were
Conservatives, and that the Conservative expenditure exceeded the Liberal in 80
per cent. of the contests, find the percentages of elections won by Conservatives
(1) when they spent more and (2) when they spent less than their opponents,
and hence say whether you consider the above figures evidence of the influence
of expenditure on election results or no. (Note that if the one candidate in a
contest be a Conservative winner who spends more than his opponent, the
other must necessarily be a Liberal loser who spends less, and so forth.
Hence the case is one of complete equality of contraries.)

10. Given that (A)/N = (B)/N = (C)/N = x, and that (AB)/N = (AC)/N = y,
find the major and minor limits to y that enable one to infer positive
association between B and C, i.e. (BC)/N > x².

Draw a diagram on squared paper to illustrate your answer, taking x and y
as co-ordinates, and shading the limits within which y must lie in order to
permit of the above inference. Point out the peculiarities in the case of
inferring a positive association from two negative associations.

11. Discuss similarly the more complex case (A)/N = x, (B)/N = 2x,
(C)/N = 3x:

(1) for inferring positive association between B and C given
    (AB)/N = (AC)/N = y;

(2) for inferring positive association between A and C given
    (AB)/N = (BC)/N = y;

(3) for inferring positive association between A and B given
    (AC)/N = (BC)/N = y.



CHAPTER V.


MANIFOLD CLASSIFICATION.

1. The general principle of a manifold classification; 2-4. The table of
double entry or contingency table and its treatment by fundamental
methods; 5-8. The coefficient of contingency; 9-10. Analysis of
a contingency table by tetrads; 11-13. Isotropic and anisotropic
distributions; 14-15. Homogeneity of the classifications dealt with
in this and the preceding chapters; heterogeneous classifications.

1. Classification by dichotomy is, as was briefly pointed out in
Chap. I. § 5, a simpler form of classification than usually occurs
in the tabulation of practical statistics. It may be regarded as
a special case of a more general form in which the individuals or
objects observed are first divided under, say, s heads, A₁, A₂ ...
Aₛ, each of the classes so obtained then subdivided under t heads,
B₁, B₂ ... Bₜ, each of these under u heads, C₁, C₂ ... Cᵤ, and
so on, thus giving rise to s.t.u. ... ultimate classes altogether.

2. The general theory of such a manifold as distinct from a
twofold or dichotomous classification, in the case of n attributes
or characters A B C ... N, would be extremely complex: in the
present chapter the discussion will be confined to the case of two
characters, A and B, only. If the classification of the A's be
s-fold and of the B's t-fold, the frequencies of the st classes of the
second order may be most simply given by forming a table with
s columns headed A₁ to Aₛ, and t rows headed B₁ to Bₜ. The
number of the objects or individuals possessing any combination
of the two characters, say Aₘ and Bₙ, i.e. the frequency of the
class AₘBₙ, is entered in the compartment common to the mth
column and the nth row, the st compartments thus giving all
the second-order frequencies. The totals at the ends of rows
and the feet of columns give the first-order frequencies, i.e. the
numbers of Aₘ's and Bₙ's, and finally the grand total at the
right-hand bottom corner gives the whole number of observations.
Tables I. and II. below will serve as illustrations of such tables
of double-entry or contingency tables, as they have been termed
by Professor Pearson (ref. 1).




3. In Table I. the division is 3 × 3-fold: the houses in England
and Wales are divided into those which are in (1) London, (2)
other urban districts, (3) rural districts, and the houses in each
of these divisions are again classified into (1) inhabited houses,
(2) uninhabited but completed houses, (3) houses that are
"building," i.e. in course of erection. Thus from the first row
we see that there were in London, in round numbers, 616,000
houses, of which 571,000 were inhabited, 40,000 uninhabited,
and 5000 in course of erection; from the first column, there
were 6,260,000 inhabited houses in England and Wales, of which
571,000 were in London, 4,064,000 in other urban districts, and
1,625,000 in rural districts.


Table I. Houses in England and Wales. (Census of 1901,
Summary Table X.) (000's omitted.)

                                 Inhabited.  Uninhabited.  Building.   Total.
Adm. County of London . . . .       571          40           5         616
Other urban districts . . . .      4064         285          45        4394
Rural districts . . . . . . .      1625         124          12        1761

Total for England and Wales .      6260         449          62        6771


In Table II., on the other hand, the classification is 3 × 4-fold:
the eye-colours are classed under the three heads "blue," "grey or
green," and "brown," while the hair-colours are classed under
four heads, "fair," "brown," "black," and "red."

Table II. Hair- and Eye-Colours of Males in Baden.
(Ammon, Zur Anthropologie der Badener.)

                               Hair colour.
Eye colour.          Fair.   Brown.   Black.   Red.    Total.
Blue . . . . . .     1768     807      189      47      2811
Grey or Green  .      946    1387      746      53      3132
Brown  . . . . .      115     438      288      16       857

Total  . . . . .     2829    2632     1223     116      6800

The table is read similarly to the last. Taking the first row, it tells us that
there were 2811 men with blue eyes noted, of whom 1768 had
fair hair, 807 brown hair, 189 black hair, and 47 red hair.
Similarly, from the first column, there were 2829 men with fair
hair, of whom 1768 had blue eyes, 946 grey or green eyes, and
115 brown eyes. The tables are a generalised form of the
fourfold (2 × 2-fold) tables in § 13, Chap. III.

4. For the purpose of discussing the nature of the relation
between the A's and the B's, any such table may be treated on
the principles of the preceding chapters by reducing it in different
ways to 2 × 2-fold form. It then becomes possible to trace the
association between any one or more of the A's and any one or
more of the B's, either in the universe at large or in universes
limited by the omission of one or more of the A's, of the B's, or
of both. Taking Table I., for example, trace the association
between the erection of houses and the urban character of a
district. Adding together the first two rows (i.e. pooling London
and the other urban districts together) and similarly adding the
first two columns, so as to make no distinction between inhabited
and uninhabited houses as long as they are completed, we find:

Proportion of all houses which are in course
  of erection in urban districts  . . . . . .   50/5010 = 10 per thousand.
Proportion of all houses which are in course
  of erection in rural districts  . . . . . .   12/1761 =  7 per thousand.

There is therefore, as might be expected, a distinct positive
association, a larger proportion of houses being in course of
erection in urban than in rural districts.

If, as another illustration, it be desired to trace the association
between the "uninhabitedness" of houses and the urban character
of the district, the procedure will be rather different. Rows 1
and 2 may be added together as before, but column 3 may be
omitted altogether, as the houses which are only in course of
erection do not enter into the question. We then have:

Proportion of all houses which are uninhabited
  in urban districts  . . . . . . . . . . . .   325/4960 = 66 per thousand.
Proportion of all houses which are uninhabited
  in rural districts  . . . . . . . . . . . .   124/1749 = 71 per thousand.

The association is therefore negative, the proportion of houses
uninhabited being greater in rural than in urban districts.



The eye- and hair-colour data of Table II. may be treated in a
precisely similar fashion. If, e.g., we desire to trace the
association between a lack of pigmentation in eyes and in hair, rows 1
and 2 may be pooled together as representing the least
pigmentation of the eyes, and columns 2, 3, and 4 may be pooled together
as representing hair with a more or less marked degree of
pigmentation. We then have:

Proportion of light-eyed with fair hair  . .   2714/5943 = 46 per cent.
Proportion of brown-eyed with fair hair  . .    115/857  = 13 per cent.

The association is therefore well-marked. For comparison we
may trace the corresponding association between the most marked
degree of pigmentation in eyes and hair, i.e. brown eyes and
black hair. Here we must add together rows 1 and 2 as before,
and columns 1, 2, and 4 (the column for red being really
misplaced, as red represents a comparatively slight degree of
pigmentation). The figures are:

Proportion of brown-eyed with black hair . .    288/857  = 34 per cent.
Proportion of light-eyed with black hair . .    935/5943 = 16 per cent.

The association is again positive and well-marked, but the
difference between the two percentages is rather less than in the
last case.
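
The reductions of § 4 amount to pooling rows and columns and taking proportions. The Python sketch below repeats them for Tables I. and II.; the list layout and the helper `pool` are assumptions made for the illustration, and the figures are those of the tables.

```python
# Table I (000's omitted): rows London, other urban, rural;
# columns inhabited, uninhabited, building.
houses = [[571, 40, 5], [4064, 285, 45], [1625, 124, 12]]

# Table II: rows blue, grey or green, brown eyes;
# columns fair, brown, black, red hair.
eyes_hair = [[1768, 807, 189, 47], [946, 1387, 746, 53], [115, 438, 288, 16]]


def pool(rows):
    """Add several rows of a table together, column by column."""
    return [sum(column) for column in zip(*rows)]


urban, rural = pool(houses[:2]), houses[2]

# Houses in course of erection, per thousand of all houses.
building = (1000 * urban[2] / sum(urban), 1000 * rural[2] / sum(rural))

# Uninhabited houses, per thousand of completed houses (column 3 omitted).
uninhabited = (
    1000 * urban[1] / (urban[0] + urban[1]),
    1000 * rural[1] / (rural[0] + rural[1]),
)

light_eyed, brown_eyed = pool(eyes_hair[:2]), eyes_hair[2]

# Percentage with fair hair, and with black hair, in each eye-colour group.
fair = (100 * light_eyed[0] / sum(light_eyed), 100 * brown_eyed[0] / sum(brown_eyed))
black = (100 * brown_eyed[2] / sum(brown_eyed), 100 * light_eyed[2] / sum(light_eyed))

print([round(x) for x in building + uninhabited + fair + black])
```

Each pair of proportions is the comparison on which the sign of the association is judged.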

5. The mode of treatment adopted in the preceding section rests
on first principles, and, if fully carried out, it gives the most detailed
information possible with regard to the relations of the two
attributes. At the same time a distinct need is felt in practical work for
some more summary method, a method which will enable a single
and definite answer to be given to such a question as: Are the
A's on the whole distinctly dependent on the B's, and if so, is this
dependence very close, or the reverse? The subject of coefficients
of association, which affords the answer to this question in the
case of a dichotomous classification, was only dealt with briefly
and incidentally, for it is still the subject of some controversy;
further, where there are only four classes of the second order
to be considered the matter is not nearly so complex as where
the number is, say, twenty-five or more, and the need for
any summary coefficient is not so often nor so keenly felt. The
ideas on which Professor Pearson's general measure of
dependence, the "coefficient of contingency," is based, are,
moreover, quite simple and fundamental, and the mode of calculation
is therefore given in full in the following section. The advanced
student should refer to the original memoir (ref. 1) for a completer
treatment of the theory of the coefficient, and of its relation to
the theory of variables.

6. Generalising slightly the notation of the preceding chapters,
let the frequency of Aₘ's be denoted by (Aₘ), the frequency of
Bₙ's by (Bₙ), and the frequency of objects or individuals possessing
both characters by (AₘBₙ). Then, if the A's and B's be
completely independent in the universe at large, we must have for all
values of m and n:

$$(A_m B_n) = \frac{(A_m)(B_n)}{N} = (A_m B_n)_0 \qquad (1)$$

If, however, A and B are not completely independent, (AₘBₙ) and
(AₘBₙ)₀ will not be identical for all values of m and n. Let
the difference be given by

$$\delta_{mn} = (A_m B_n) - (A_m B_n)_0 \qquad (2)$$

A coefficient such as we are seeking may evidently be based in
some way on these values of δ. It will not do, however, simply to
add them together, for the sum of all the values of δ, some of
which are negative and others positive, must be zero in any case,
the sum of both the (AB)'s and the (AB)₀'s being equal to the
whole number of observations N. It is necessary, therefore, to
get rid of the signs, and this may be done in two simple ways: (1)
by neglecting them and forming the arithmetical instead of the
algebraical sum of the differences δ, or (2) by squaring the
differences and then summing the squares. The first process is the
shorter, but the second the better, as it leads to a coefficient
easily treated by algebraical methods, which the first process
does not: as the student will see later, squaring is very
usefully and very frequently employed for the purpose of
eliminating algebraical signs. Suppose, then, that every δ is calculated,
and also the ratio of its square to the corresponding value of
(AB)₀, and that the sum of all such ratios is, say, χ²; or, in
symbols, using Σ to denote "the sum of all quantities like":

$$\chi^2 = \Sigma\left\{\frac{\delta_{mn}^2}{(A_m B_n)_0}\right\} \qquad (3)$$

Being the sum of a series of squares, χ² is necessarily positive,
and if A and B be independent it is zero, because every δ is zero.
If, then, we form a coefficient C given by the relation

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}} \qquad (4)$$

this coefficient is zero if the characters A and B are completely
independent, and approaches more and more nearly towards
unity as χ² increases. In general, no sign should be attached
to the root, for the coefficient simply shows whether the two
characters are or are not independent, and nothing more, but in
some cases a conventional sign may be used. Thus in Table II.
slight pigmentation of eyes and of hair appear to go together,
and the contingency may be regarded as definitely positive. If
slight pigmentation of eyes had been associated with marked
pigmentation of hair, the contingency might have been regarded
as negative. C is Professor Pearson's mean square contingency
coefficient.¹
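
The definitions of this section translate directly into a short routine. The Python sketch below computes the independence-values, the differences δ, χ² and C for any table of double entry, and applies it to the hair- and eye-colour data of Table II. The function name `contingency` is hypothetical, and no rounding of the independence-values is made, so the result may differ slightly from hand arithmetic.

```python
from math import sqrt


def contingency(table):
    """Return (chi-squared, C) for a table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    chi_squared = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            independent = row_totals[i] * column_totals[j] / n  # (AmBn)0
            delta = observed - independent                      # delta_mn
            chi_squared += delta ** 2 / independent
    return chi_squared, sqrt(chi_squared / (n + chi_squared))


# Table II: rows blue, grey or green, brown eyes;
# columns fair, brown, black, red hair.
table_ii = [[1768, 807, 189, 47], [946, 1387, 746, 53], [115, 438, 288, 16]]
chi_squared, C = contingency(table_ii)
print(round(chi_squared, 1), round(C, 2))

# A table whose rows are exactly proportional gives zero contingency.
chi_zero, C_zero = contingency([[10, 20], [20, 40]])
```

The second call illustrates that C vanishes when every δ is zero.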

7. The coefficient, in the simple form (4), has one disadvantage, 
viz. that coefficients calculated on different systems of classi- 
fication are not comparable with each other. It is clearly desir- 
able for practical purposes that two coefficients calculated from 
the same data classified in two different ways should be, at least 
approximately, identical With the present coefficient this is not 
the case if certain data be classified m, say, (1) fix 6-fold, (2) 
3 X 3-fold form, the coefficient m the latter form tends to be the 
least. The greatest possible value of the coefficient is, in fact, 
only unity if the number of classes be infinitely great ; for any 
finite number of classes the bmitmg value of G is the smaller the 
smaller the number of classes. This may be briefly illustrated as 
follows Replacing m equation (3) by its value m terms of 
and Anfisa we Lave— 



and therefore, denoting the summation by >S, 

... ( 6 ) 

Now suppose we have to deal with a $t \times t$-fold classification in
which $(A_m) = (B_m)$ for all values of $m$; and suppose, further, that
the association between $A_m$ and $B_m$ is perfect, so that
$(A_mB_m) = (A_m) = (B_m)$ for all values of $m$, the remaining frequencies
of the second order being zero; all the frequency is then concentrated
in the diagonal compartments of the table, and each contributes
$N$ to the sum $S$. The total value of $S$ is accordingly $tN$, and the
value of $C$ is

$$C = \sqrt{\frac{t - 1}{t}}$$

This is the greatest possible value of $C$ for a symmetrical $t \times t$-fold
classification, and therefore, in such a table, for

    t =  2    C cannot exceed 0.707
    t =  3                    0.816
    t =  4                    0.866
    t =  5                    0.894
    t =  6                    0.913
    t =  7                    0.926
    t =  8                    0.935
    t =  9                    0.943
    t = 10                    0.949

It is as well, therefore, to restrict the use of the "coefficient of
contingency" to 5 × 5-fold or finer classifications. At the same
time the classification must not be made too fine, or else the value
of the coefficient is largely affected by casual irregularities of no
physical significance in the class-frequencies (cf. the remarks in
Chap. III. §§ 7-8).

¹ Professor Pearson (ref. 1) terms $\delta$ a sub-contingency; $\chi^2$ the square
contingency; the ratio $\chi^2/N$, which he denotes by $\phi^2$, the mean square
contingency; and the sum of all the $\delta$'s of one sign only, on which a
different coefficient can be based, the mean contingency.
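The limiting values listed in § 7 follow directly from the expression $\sqrt{(t-1)/t}$. The short Python sketch below recomputes them; the function name `max_contingency` is our own label for illustration and does not occur in the text.

```python
import math

def max_contingency(t: int) -> float:
    """Greatest possible C for a symmetrical t x t-fold table: sqrt((t - 1) / t)."""
    if t < 2:
        raise ValueError("a contingency table needs at least two classes")
    return math.sqrt((t - 1) / t)

# Tabulate the limit for t = 2 ... 10, as in the text.
for t in range(2, 11):
    print(f"t = {t:2d}   C cannot exceed {max_contingency(t):.3f}")
```

The limit approaches unity only slowly as t grows, which is why coefficients from coarse and fine classifications of the same data are not directly comparable.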


Table III. Independence-Values of the Frequencies for Table II.

                              Hair colour.
    Eye colour.        Fair.   Brown.   Black.   Red.
    Blue               1169    1088     506      48.0
    Grey or Green      1303    1212     563      53.4
    Brown               357     332     154      14.6


8. As the classification of Table II. is only 3 × 4-fold, it is rather
crude for the purpose of calculating the coefficient, but will serve
simply as an illustration of the form of the arithmetic. In Table
III. are given the values of the independence frequencies, 2829 ×
2811/6800 = 1169, and so on. The value of $\chi^2$ is more readily
calculated from equation (5) than from (3):






    (1768)²/1169      2673.9
    (946)²/1303        686.8
    (115)²/357          37.0
    (807)²/1088        598.6
    (1387)²/1212      1587.3
    (438)²/332         577.8
    (189)²/506          70.6
    (746)²/563         988.5
    (288)²/154         538.6
    (47)²/48.0          46.0
    (53)²/53.4          52.6
    (16)²/14.6          17.5
                      ------
    Total = S =       7875.2
            N =       6800
                      ------
       S - N = χ² =   1075.2


The squares in such work may conveniently be taken from
Barlow's Tables of Squares, Cubes, etc. (see list of tables on
p. 356), or logarithms may be used throughout; five-figure
logarithms are quite sufficient.
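The arithmetic of § 8 can be checked mechanically. The sketch below is only an illustration: the observed frequencies are those appearing as numerators in the worked sums (they agree with the totals 2829, 2811 and 6800 quoted in the text), and the function name `contingency` is ours. Because the program keeps the independence values unrounded, whereas the text rounds them to four figures, the sum S comes out a unit or two away from the printed 7875.2.

```python
import math

# Observed frequencies of Table II: rows are eye colours,
# columns are hair colours (fair, brown, black, red).
observed = [
    [1768,  807, 189, 47],   # blue eyes
    [ 946, 1387, 746, 53],   # grey or green eyes
    [ 115,  438, 288, 16],   # brown eyes
]

def contingency(table):
    """Return (S, chi-squared, C) using equations (5) and (6)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # S = sum of (observed)^2 / (independence value),
    # where the independence value is (row total)(column total)/N.
    s = sum(
        table[i][j] ** 2 / (row_totals[i] * col_totals[j] / n)
        for i in range(len(table))
        for j in range(len(table[0]))
    )
    chi2 = s - n                  # equation (5)
    c = math.sqrt(chi2 / s)       # equation (6)
    return s, chi2, c

s, chi2, c = contingency(observed)
print(f"S = {s:.1f}, chi-squared = {chi2:.1f}, C = {c:.2f}")
```

Working from equation (5) avoids forming the twelve differences δ separately, which is the economy the text recommends.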

9. While such a coefficient of contingency, in some form or
other, is a great convenience in many fields of work, its use
should not lead to a neglect of those details which a treatment by
the elementary methods of § 4 would have revealed. Whether
the coefficient be calculated or no, every table should always be
examined with care to see if it exhibit any apparently significant
peculiarities in the distribution of frequency, e.g. in the associations
subsisting between $A_m$ and $B_n$ in limited universes. A good
deal of caution must be used in order not to be misled by casual
irregularities due to paucity of observations in some compartments
of the table, but important points that would otherwise be overlooked
will often be revealed by such a detailed examination.

10. Suppose, for example, that any four adjacent frequencies,
say

$$\begin{matrix}(A_mB_n) & (A_{m+1}B_n)\\ (A_mB_{n+1}) & (A_{m+1}B_{n+1})\end{matrix}$$

are extracted from the general contingency table. Considering
these as a table exhibiting the association between $A_m$ and $B_n$ in
a universe limited to $A_m$, $A_{m+1}$, $B_n$, $B_{n+1}$ alone, the association is
positive, negative, or zero according as $(A_mB_n)/(A_{m+1}B_n)$ is greater
than, less than, or equal to the ratio $(A_mB_{n+1})/(A_{m+1}B_{n+1})$. The
whole of the contingency table can be analysed into a series of
elementary groups of four frequencies like the above, each one
overlapping its neighbours so that an $r \times s$-fold table contains
$(r - 1)(s - 1)$ such "tetrads," and the associations in them all can
be very quickly determined by simply tabulating the ratios like
$(A_mB_n)/(A_{m+1}B_n)$, $(A_mB_{n+1})/(A_{m+1}B_{n+1})$, etc., or perhaps better,
the proportions $(A_mB_n)/\{(A_mB_n) + (A_{m+1}B_n)\}$, etc., for every pair
of columns or of rows, as may be most convenient. Taking the
figures of Table II. as an illustration, and working from the
rows, the proportions run as follows:


    For rows 1 and 2.          For rows 2 and 3.
    1768/2714   0.651           946/1061   0.892
     807/2194   0.368          1387/1825   0.760
     189/935    0.202           746/1034   0.721
      47/100    0.470            53/69     0.768


In both cases the first three ratios form descending series, but
the fourth ratio is greater than the second. The signs of the
associations in the six tetrads are accordingly

    +   +   -
    +   +   -

The negative sign in the two tetrads on the right is striking,
the more so as other tables for hair- and eye-colour, arranged in
the same way, exhibit just the same characteristic. But the
peculiarity will be removed at once if the fourth column be placed
immediately after the first: if this be done, i.e. if "red" be placed
between "fair" and "brown" instead of at the end of the colour-series,
the sign of the association in all the elementary tetrads
will be the same. The colours will then run fair, red, brown,
black, and this would seem to be the more natural order, considering
the depth of the pigmentation.
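The tetrad signs of § 10 may also be obtained without forming the proportions, by comparing the cross-products of each tetrad directly. The following Python sketch is offered under that assumption; the helper names `tetrad_signs` and `reorder_columns` are ours, and the frequencies are those implied by the fractions quoted above (e.g. 1768 and 2714 - 1768 = 946).

```python
def tetrad_signs(table):
    """Sign (+1, 0, -1) of the association in every elementary tetrad.

    The tetrad on rows i, i+1 and columns j, j+1 is positive when
    a[i][j] * a[i+1][j+1] exceeds a[i][j+1] * a[i+1][j].
    """
    signs = []
    for i in range(len(table) - 1):
        row = []
        for j in range(len(table[0]) - 1):
            d = table[i][j] * table[i + 1][j + 1] - table[i][j + 1] * table[i + 1][j]
            row.append((d > 0) - (d < 0))
        signs.append(row)
    return signs

def reorder_columns(table, order):
    """Return a copy of the table with its columns taken in the given order."""
    return [[row[j] for j in order] for row in table]

# Table II: rows blue, grey or green, brown eyes;
# columns fair, brown, black, red hair.
table_ii = [
    [1768,  807, 189, 47],
    [ 946, 1387, 746, 53],
    [ 115,  438, 288, 16],
]

print(tetrad_signs(table_ii))                                  # columns as printed
print(tetrad_signs(reorder_columns(table_ii, [0, 3, 1, 2])))   # fair, red, brown, black
```

With the columns as printed the right-hand tetrads are negative; with red moved next to fair every tetrad is positive, which is the rearrangement the text proposes.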

11. A distribution of frequency of such a kind that the
association in every elementary tetrad is of the same sign
possesses several useful and interesting properties, as shown in
the following theorems. It will be termed an isotropic distribution.

(1) In an isotropic distribution the sign of the association is
the same not only for every elementary tetrad of adjacent frequencies,
but for every set of four frequencies in the compartments
common to two rows and two columns, e.g.

$$\begin{matrix}(A_mB_n) & (A_{m+p}B_n)\\ (A_mB_{n+q}) & (A_{m+p}B_{n+q})\end{matrix}$$

For suppose that the sign of association in the elementary
tetrads is positive, so that

$$(A_mB_n)(A_{m+1}B_{n+1}) > (A_{m+1}B_n)(A_mB_{n+1}) \qquad (1)$$

and similarly,

$$(A_{m+1}B_n)(A_{m+2}B_{n+1}) > (A_{m+2}B_n)(A_{m+1}B_{n+1}) \qquad (2)$$

Then multiplying up and cancelling we have

$$(A_mB_n)(A_{m+2}B_{n+1}) > (A_{m+2}B_n)(A_mB_{n+1}) \qquad (3)$$

That is to say, the association is still positive though the two
columns $A_m$ and $A_{m+2}$ are no longer adjacent.

(2) An isotropic distribution remains isotropic in whatever way
it may be condensed by grouping together adjacent rows or columns.

Thus from (1) and (3) we have, adding,

$$(A_mB_n)[(A_{m+1}B_{n+1}) + (A_{m+2}B_{n+1})] > (A_mB_{n+1})[(A_{m+1}B_n) + (A_{m+2}B_n)]$$

that is to say, the sign of the elementary association is unaffected
by throwing the $(m + 1)$th and $(m + 2)$th columns into one.

(3) As the extreme case of the preceding theorem, we may
suppose both rows and columns grouped and regrouped until
only a 2 × 2-fold table is left; we then have the theorem:

If an isotropic distribution be reduced to a fourfold distribution
in any way whatever, by addition of adjacent rows and columns,
the sign of the association in such fourfold table is the same as in
the elementary tetrads of the original table.

The case of complete independence is a special case of isotropy.
For if

$$(A_mB_n) = \frac{(A_m)(B_n)}{N}$$

for all values of $m$ and $n$, the association is evidently zero for
every tetrad. Therefore the distribution remains independent
in whatever way the table be grouped, or in whatever way the
universe be limited by the omission of rows or columns. The
expression "complete independence" is therefore justified.

From the work of the preceding section we may say that Table
II. is not isotropic as it stands, but may be regarded as a disarrangement
of an isotropic distribution. It is best to rearrange
such a table in isotropic order, as otherwise different reductions
to fourfold form may lead to associations of different sign, though
of course they need not necessarily do so.
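Theorem (3) can be illustrated numerically on Table II. once its hair-colour columns stand in the isotropic order fair, red, brown, black. The sketch below is an illustration only, not part of the original argument: it forms every fourfold table obtainable by pooling adjacent rows and adjacent columns and records the sign of its association. The names `fourfold` and `association_sign` are ours.

```python
from itertools import product

# Table II with columns in the order fair, red, brown, black;
# rows are blue, grey or green, brown eyes.
isotropic = [
    [1768, 47,  807, 189],
    [ 946, 53, 1387, 746],
    [ 115, 16,  438, 288],
]

def fourfold(table, row_cut, col_cut):
    """Condense to 2 x 2 by pooling the rows above/below row_cut and
    the columns left/right of col_cut (adjacent classes only)."""
    def block(rows, cols):
        return sum(table[i][j] for i in rows for j in cols)
    top, bottom = range(row_cut), range(row_cut, len(table))
    left, right = range(col_cut), range(col_cut, len(table[0]))
    return [[block(top, left), block(top, right)],
            [block(bottom, left), block(bottom, right)]]

def association_sign(t):
    """+1, 0 or -1 according to the sign of the cross-product difference."""
    d = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    return (d > 0) - (d < 0)

signs = {
    (r, c): association_sign(fourfold(isotropic, r, c))
    for r, c in product(range(1, len(isotropic)), range(1, len(isotropic[0])))
}
print(signs)
```

All six reductions give a positive association, as the theorem requires of an isotropic table whose elementary tetrads are positive.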

12. The following will serve as an illustration of a table that
is not isotropic, and cannot be rendered isotropic by any rearrangement
of the order of rows and columns.

Table IV. Showing the Frequencies of Different Combinations of
Eye-colours in Father and Son.

(Data of Sir F. Galton, from Karl Pearson, Phil. Trans., A, vol. cxcv.
(1900), p. 138; classification condensed.)

1. Blue. 2. Blue-green, grey. 3. Dark grey, hazel. 4. Brown.

    Son's                Father's eye-colour.
    eye-colour.      1.      2.      3.      4.     Total.
    1.              194      70      41      30      335
    2.               83     124      41      36      284
    3.               25      34      55      23      137
    4.               56      36      43     109      244
    Total           358     264     180     198     1000


The following are the ratios of the frequency in column $m$ to
the sum of the frequencies in columns $m$ and $m + 1$:

                   Columns
    1 and 2.      2 and 3.      3 and 4.
     0.735         0.631         0.577
     0.401         0.752         0.532
     0.424         0.382         0.705
     0.609         0.456         0.283


The order in which the ratios run is different for each pair of
columns, and it is accordingly impossible to make the table
isotropic. The distribution of signs of association in the several
tetrads is

    +   -   +
    -   +   -
    -   -   +

The distribution is a curious one, the associations in tetrads
round the diagonal of the whole table being so markedly positive
and those in the immediately adjacent tetrads equally markedly
negative. Neglecting the other signs, this is the effect that
would be produced by taking an isotropic distribution and then
increasing the frequencies in the diagonal compartments by a
sufficient percentage. Comparison of the given table with others
from the same source shows that the peculiarity is common to
the great majority of the tables, and accordingly its origin
demands explanation. Were such a table treated by the method
of the contingency coefficient, or a similar summary method,
alone, the peculiarity might not be remarked.
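The statement that Table IV. cannot be rendered isotropic by any rearrangement can be checked by brute force, since a 4 × 4 table admits only 24 × 24 orderings of rows and columns. The sketch below is an illustration under our own naming (`is_isotropic`, `can_be_made_isotropic`); the entries 124 and 56 are the values consistent with the printed row and column totals.

```python
from itertools import permutations

# Table IV: rows are the son's eye-colour classes 1-4, columns the father's.
table_iv = [
    [194,  70, 41,  30],
    [ 83, 124, 41,  36],
    [ 25,  34, 55,  23],
    [ 56,  36, 43, 109],
]

def tetrad_sign_set(table):
    """Set of signs (+1, 0, -1) found among the elementary tetrads."""
    signs = set()
    for i in range(len(table) - 1):
        for j in range(len(table[0]) - 1):
            d = table[i][j] * table[i + 1][j + 1] - table[i][j + 1] * table[i + 1][j]
            signs.add((d > 0) - (d < 0))
    return signs

def is_isotropic(table):
    """True when every elementary tetrad has the same sign."""
    return len(tetrad_sign_set(table)) == 1

def can_be_made_isotropic(table):
    """Try every ordering of rows and of columns."""
    for rows in permutations(range(len(table))):
        for cols in permutations(range(len(table[0]))):
            rearranged = [[table[i][j] for j in cols] for i in rows]
            if is_isotropic(rearranged):
                return True
    return False

print(is_isotropic(table_iv), can_be_made_isotropic(table_iv))
```

For Table IV. both checks come out false, in agreement with the text. An exhaustive search of this kind is practicable only for small tables; for larger ones the comparison of the orders of the ratios, as in § 12, is the more economical test.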

13. It may be noted, in concluding this part of the subject,
that in the case of complete independence the distribution of
frequency in every row is similar to the distribution in the row
of totals, and the distribution in every column similar to that in
the column of totals; for in, say, the column $A_n$ the frequencies
are given by the relations

$$(A_nB_1) = \frac{(A_n)}{N}(B_1), \qquad (A_nB_2) = \frac{(A_n)}{N}(B_2),$$

and so on. This property is of special importance in the theory
of variables.

14. The classifications both of this and of the preceding chapters
have one important characteristic in common, viz. that they
are, so to speak, "homogeneous," the principle of division
being the same for all the sub-classes of any one class. Thus
$A$'s and $\alpha$'s are both subdivided into $B$'s and $\beta$'s, $A_1$'s, $A_2$'s ....
$A_r$'s into $B_1$'s .... $B_s$'s, and so on. Clearly this is necessary
in order to render possible those comparisons on which the
discussions of associations and contingencies depend. If we
only know that amongst the $A$'s there is a certain percentage
of $B$'s, and amongst the $\alpha$'s a certain percentage of $C$'s, there
are no data for any conclusion.

Many classifications are, however, essentially of a heterogeneous
character, e.g. biological classifications into orders, genera, and
species, the classifications of the causes of death in vital
statistics, and of occupations in the census. To take the last
case as an illustration, the first "order" in the list of occupations
is "General or Local Government of the Country," subdivided
under the headings (1) National Government, (2) Local Government.
The next order is "Defence of the Country," with the
sub-headings (1) Army, (2) Navy and Marines, not (1) National
and (2) Local Government again: the sub-heads are necessarily
distinct. Similarly, the third order is "Professional Occupations
and their Subordinate Services," with the fresh sub-heads (1)
Clerical, (2) Legal, (3) Medical, (4) Teaching, (5) Literary and
Scientific, (6) Engineers and Surveyors, (7) Art, Music, Drama,
(8) Exhibitions, Games, etc. The number of sub-heads under
each main heading is, in such a case, arbitrary and variable,
and different for each main heading; but so long as the
classification remains purely heterogeneous, however complex
it may become, there is no opportunity for any discussion
of causation within the limits of the matter so derived. It is
only when a homogeneous division is in some way introduced
that we can begin to speak of associations and contingencies.

15. This may be done in various ways according to the
nature of the case. Thus the relative frequencies of different
botanical families, genera, or species may be discussed in
connection with the topographical characters of their habitats
(desert, marsh, or moor), and we may observe statistical associations
between given genera and situations of a given topographical
type. The causes of death may be classified according to sex,
or age, or occupation, and it then becomes possible to discuss
the association of a given cause of death with one or other
of the two sexes, with a given age-group, or with a given
occupation. Again, the classifications of deaths and of occupations
are repeated at successive intervals of time; and if they have
remained strictly the same, it is also possible to discuss the
association of a given occupation or a given cause of death with
the earlier or later year of observation, i.e. to see whether the
numbers of those engaged in the given occupation or succumbing
to the given cause of death have increased or decreased. But
in such circumstances the greatest care must be taken to see
that the necessary condition as to the identity of the classifications
at the two periods is fulfilled, and unfortunately it very
seldom is fulfilled. All practical schemes of classification are
subject to alteration and improvement from time to time, and
these alterations, however desirable in themselves, render a
certain number of comparisons impossible. Even where a
classification has remained verbally the same, it is not necessarily
really the same; thus, in the case of the causes of death,
improved methods of diagnosis may transfer many deaths from
one heading to another without any change in the incidence
of the disease, and so bring about a virtual change in the
classification. In any case, heterogeneous classification should
be regarded only as a partial process, incomplete until a
homogeneous division is introduced either directly or indirectly,
e.g. by repetition.


REFERENCES.

Contingency.

(1) Pearson, Karl, "On the Theory of Contingency and its Relation to
Association and Normal Correlation," Drapers' Company Research
Memoirs, Biometric Series i.; Dulau & Co., London, 1904. (The
memoir in which the coefficient of contingency is proposed.)

(2) Lipps, G. F., "Die Bestimmung der Abhängigkeit zwischen den
Merkmalen eines Gegenstandes," Berichte der math.-phys. Klasse der
kgl. Sächsischen Gesellschaft der Wissenschaften; Leipzig, 1905. (A
general discussion of the problems of association and contingency.)

(3) Pearson, Karl, "On a Coefficient of Class Heterogeneity or Divergence,"
Biometrika, vol. v. p. 198, 1906. (An application of the contingency
coefficient to the measurement of heterogeneity, e.g. in different
districts of a country, by treating the observed frequencies of some
quality $A_1$, $A_2$ .... $A_n$ in the different districts as rows of a
contingency table and working out the coefficient: the same principle is
also applicable to the comparison of a single district with the rest of
the country.)

Isotropy.

(4) Yule, G. U., "On a Property which holds good for all Groupings of a
Normal Distribution of Frequency for Two Variables, with applications
to the Study of Contingency Tables for the Inheritance of Unmeasured
Qualities," Proc. Roy. Soc., Series A, vol. lxxvii., 1906, p. 324. (On
the property of isotropy and some applications.)

(5) Yule, G. U., "On the Influence of Bias and of Personal Equation in
Statistics of Ill-defined Qualities," Jour. Anthrop. Inst., vol. xxxvi.,
1906, p. 325. (Includes an investigation as to the influence of bias
and of personal equation in creating divergences from isotropy in
contingency tables.)

Contingency Tables of two Rows only.

(6) Pearson, Karl, "On a New Method of Determining Correlation between
a Measured Character A and a Character B of which only the Percentage
of Cases wherein B exceeds (or falls short of) a given Intensity is recorded
for each Grade of A," Biometrika, vol. vii., 1909, p. 96. (Deals with a
measure of dependence for a common type of table, e.g. a table showing
the numbers of candidates who passed or failed at an examination, for
each year of age. The table of such a type stands between the
contingency tables for unmeasured characters and the correlation table
(chap. ix.) for variables. Pearson's method is based on that adopted
for the correlation table, and assumes a normal distribution of
frequency (chap. xv.) for B.)

(7) Pearson, Karl, "On a New Method of Determining Correlation, when
one Variable is given by Alternative and the other by Multiple
Categories," Biometrika, vol. vii., 1910, p. 248. (The similar
problem for the case in which the variable is replaced by an
unmeasured quality.)


EXERCISES. 

(1) (Data from Karl Pearson, "On the Inheritance of the Mental and Moral
Characters in Man," Jour. of the Anthrop. Inst., vol. xxxiii., and Biometrika,
vol. iii.) Find the coefficient of contingency (coefficient of mean square
contingency) for the two tables below, showing the resemblance between
brothers for athletic capacity and between sisters for temper. Show that
neither table is even remotely isotropic. (As stated in § 7, the coefficient of
contingency should not as a rule be used for tables smaller than 5 × 5-fold:
these small tables are given to illustrate the method, while avoiding lengthy
arithmetic.)



A. Athletic Capacity.

                             First Brother.
    Second Brother.    Athletic.   Betwixt.   Non-athletic.   Total.
    Athletic              906         20          140          1066
    Betwixt                20         76            9           105
    Non-athletic          140          9          370           519
    Total                1066        105          519          1690

B. Temper.

                             First Sister.
    Second Sister.     Quick.   Good-natured.   Sullen.   Total.
    Quick               198         177            77       452
    Good-natured        177         996           165      1338
    Sullen               77         165           120       362
    Total               452        1338           362      2152





PART II. THE THEORY OF VARIABLES.


CHAPTER VI.

THE FREQUENCY-DISTRIBUTION.

1. Introductory. 2. Necessity for classification of observations: the
frequency-distribution. 3. Illustrations. 4. Method of forming the table.
5. Magnitude of class-interval. 6. Position of intervals. 7. Process of
classification. 8. Treatment of intermediate observations. 9. Tabulation.
10. Tables with unequal intervals. 11. Graphical representation
of the frequency-distribution. 12. Ideal frequency-distributions.
13. The symmetrical distribution. 14. The moderately asymmetrical
distribution. 15. The extremely asymmetrical or J-shaped
distribution. 16. The U-shaped distribution.

1. The methods described in Chaps. I.-V. are applicable to all
observations, whether qualitative or quantitative; we have now
to proceed to the consideration of specialised processes, definitely
adapted to the treatment of quantitative measurements, but not
as a rule available (with some important exceptions, as suggested
by Chap. I. § 2) for the discussion of purely qualitative observations.
Since numerical measurement is applied only in the case
of a quantity that can present more than one numerical value,
that is, a varying quantity, or more shortly a variable, this section
of the work may be termed the theory of variables. As common
examples of such variables that are subject to statistical treatment
may be cited birth- or death-rates, prices, wages, barometer
readings, rainfall records, and measurements or enumerations (e.g.
of glands, spines, or petals) on animals or plants.

2. If some hundreds or thousands of values of a variable have
been noted merely in the arbitrary order in which they happened
to occur, the mind cannot properly grasp the significance of the
record: the observations must be ranked or classified in some
way before the characteristics of the series can be comprehended,
and those comparisons, on which arguments as to causation
depend, can be made with other series. The dichotomous
classification, considered in Chaps. I.-IV., is too crude: if the values are
merely classified as $A$'s or $\alpha$'s according as they exceed or fall
short of some fixed value, a large part of the information given
by the original record is lost. A manifold classification, however
(cf. Chap. V.), avoids the crudity of the dichotomous form, since
the classes may be made as numerous as we please, and numerical
measurements lend themselves with peculiar readiness to a
manifold classification, for the class limits can be conveniently
and precisely defined by assigned values of the variable. For
convenience, the values of the variable chosen to define the
successive classes should be equidistant, so that the numbers of
observations in the different classes (the class-frequencies) may be
comparable. Thus for measurements of stature the interval
chosen for classifying (the class-interval, as it may be termed)
might be 1 inch, or 2 centimetres, the numbers of individuals
being counted whose statures fall within each successive inch, or
each successive 2 centimetres, of the scale; returns of birth- or
death-rates might be grouped to the nearest unit per thousand
of the population; returns of wages might be classified to the
nearest shilling, or, if desired to obtain a more condensed table,
by intervals of five shillings or ten shillings, and so on. When
the variation is discontinuous, as for example in enumerations
of numbers of children in families or of petals on flowers, the
unit is naturally taken as the class-interval unless the range of
variation is very great. The manner in which the observations
are distributed over the successive equal intervals of the scale is
spoken of as the frequency-distribution of the variable.

3. A few illustrations will make clearer the nature of such
frequency-distributions, and the service which they render in
summarising a long and complex record:

(a) Table I. In this illustration the mean annual death-rates,
expressed as proportions per thousand of the population per
annum, of the 632 registration districts of England and Wales,
for the decade 1881-90, have been classified to the nearest unit;
i.e. the numbers of districts have been counted in which the
death-rate was over 12.5 but under 13.5, over 13.5 but under
14.5, and so on. The frequency-distribution is shown by the
following table.


Table I. Showing the Numbers of Registration Districts in England and
Wales with Different Mean Death-rates per Thousand of the Population
per Annum for the Ten Years 1881-90. (Material from the Supplement
to the 55th Annual Report of the Registrar-General for England and
Wales [C. 7769], 1895.)

    Mean Annual      Number of Districts with
    Death-rate.      Death-rate between Limits stated.
    12.5-13.5             5
    13.5-14.5            16
    14.5-15.5            61
    15.5-16.5           112
    16.5-17.5           159
    17.5-18.5           104
    18.5-19.5            67
    19.5-20.5            42
    20.5-21.5            25
    21.5-22.5            18
    22.5-23.5             8
    23.5-24.5             5
    24.5-25.5             3
    25.5-26.5             1
    26.5-27.5             1
    27.5-28.5             2
    28.5-29.5             -
    29.5-30.5             -
    30.5-31.5             2
    31.5-32.5             -
    32.5-33.5             1
    Total               632


Whilst a glance through the original returns fails to convey
any very definite impression, owing to the large and erratic
differences between the death-rates in successive districts, a brief
inspection of the above table brings out a number of important
points. Thus we see that the death-rates range, in round
numbers, from 13 to 33 per thousand per annum, but in the
great majority of districts lie nearer the lower limit than the
upper; that the death-rates in some 60 per cent. of the districts
lie within the narrow limits 15.5 to 18.5, the rates being most
frequent near 17 per thousand, and so forth.

(b) Table II. The ages at death, in years, of the married
women in certain Quaker families were recorded and classified in
5-year groups according as they were over 17.5 but under 22.5,
over 22.5 but under 27.5, and so on. The frequency-distribution
was as follows:


Table II. Showing the Numbers of Married Women, in certain Quaker
Families, Dying at Different Ages. (Cited from Proc. Roy. Soc., vol. lxvii.
(1900), p. 172: "On the Correlation between Duration of Life and Number
of Offspring," by Miss M. Beeton, Karl Pearson, and G. U. Yule.)

    Age at Death,      Number of Women Dying
    Years.             between said Years of Age.
    17.5- 22.5              29
    22.5- 27.5              87
    27.5- 32.5              99
    32.5- 37.5             109
    37.5- 42.5              90
    42.5- 47.5              87
    47.5- 52.5              64
    52.5- 57.5              54
    57.5- 62.5              69
    62.5- 67.5              73
    67.5- 72.5              83
    72.5- 77.5              77
    77.5- 82.5              78
    82.5- 87.5              59
    87.5- 92.5              26
    92.5- 97.5               7
    97.5-102.5               4
    Total                 1095


The distribution is somewhat more irregular than in the last
case: the commencement is abrupt, a maximum frequency is
attained in the fourth class (age at death 32.5 to 37.5), and then
there is a slow fall to the age-class 52.5-57.5. After this class
the frequency rises again and attains a secondary maximum in
the age-class 67.5-72.5.

(c) Table III. The numbers of stigmatic rays on a number
of Shirley poppies were counted. As the range of variation is
not great, the unit is taken as the class-interval. The
frequency-distribution is given by the following table.

Table III. Showing the Frequencies of Seed Capsules on certain Shirley
Poppies, with Different Numbers of Stigmatic Rays. (Cited from
Biometrika, ii. p. 89, 1902.)

    Number of          Number of Capsules with said
    Stigmatic Rays.    Number of Stigmatic Rays.
     6                       3
     7                      11
     8                      38
     9                     106
    10                     152
    11                     238
    12                     305
    13                     315
    14                     302
    15                     234
    16                     128
    17                      50
    18                      19
    19                       3
    20                       1
    Total                 1905





The numbers of rays range from 6 to 20, 12, 13, or 14 rays
being the most usual.

4. To expand slightly the brief description given in § 2, tables
like the preceding are formed in the following way: (1) The
magnitude of the class-interval, i.e. the number of units to each
interval, is first fixed; one unit was chosen in the case of Tables
I. and III., five units in the case of Table II. (2) The position or
origin of the intervals must then be determined, e.g. in Table I.
we must decide whether to take as intervals 12-13, 13-14, 14-15,
etc., or 12.5-13.5, 13.5-14.5, 14.5-15.5, etc. (3) This choice
having been made, the complete scale of intervals is fixed, and the
observations are classified accordingly. (4) The process of
classification being finished, a table is drawn up on the general
lines of Tables I.-III., showing the total numbers of observations
in each class-interval. Some remarks may be made on each of
these heads.
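The four steps just enumerated translate readily into a short program. The Python sketch below is an illustration only: the function name `frequency_distribution` is ours, the ten death-rates are invented for the purpose, and each class is taken to include its lower limit and exclude its upper.

```python
import math
from collections import Counter

def frequency_distribution(values, interval, origin):
    """Steps (1)-(4): with the class-interval and origin fixed, sort each
    value into its class and list every class from lowest to highest.

    Class k covers origin + k*interval up to, but not including,
    origin + (k+1)*interval.
    """
    counts = Counter(math.floor((v - origin) / interval) for v in values)
    lowest, highest = min(counts), max(counts)
    return [
        (origin + k * interval, origin + (k + 1) * interval, counts.get(k, 0))
        for k in range(lowest, highest + 1)
    ]

# Hypothetical death-rates per thousand, invented purely for illustration.
rates = [13.2, 16.8, 17.1, 17.4, 15.9, 18.3, 16.6, 14.7, 17.0, 19.9]

for lower, upper, n in frequency_distribution(rates, interval=1.0, origin=12.5):
    print(f"{lower:4.1f}-{upper:4.1f}  {n}")
```

Empty classes between the extremes are kept in the output, so that the table shows the whole scale of intervals, as in Table I.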

5. Magnitude of Class-interval. As already remarked, in cases
where the variation proceeds by discrete steps of considerable
magnitude as compared with the range of variation, there is very
little choice as regards the magnitude of the class-interval. The
unit will in general have to serve. But if the variation be continuous,
or at least take place by discrete steps which are small
in comparison with the whole range of variation, there is no such
natural class-interval, and its choice is a matter for judgment.

The two conditions which guide the choice are these: (a) we
desire to be able to treat all the values assigned to any one class,
without serious error, as if they were equal to the mid-value
of the class-interval, e.g. as if the death-rate of every district in
the first class of Table I. were exactly 13.0, the death-rate of
every district in the second class 14.0, and so on; (b) for convenience
and brevity we desire to make the interval as large as
possible, subject to the first condition. These conditions will
generally be fulfilled if the interval be so chosen that the whole
number of classes lies between 15 and 25. A number of classes
less than, say, ten leads in general to very appreciable inaccuracy,
and a number over, say, thirty makes a somewhat unwieldy
table. A preliminary inspection of the record should accordingly
be made and the highest and lowest values be picked out.
Dividing the difference between these by, say, five and twenty, we
have an approximate value for the interval. The actual value
should be the nearest integer or simple fraction.
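The rule of thumb in § 5 may be sketched as follows. The list of "simple" widths is our own assumption, since the text asks only for the nearest integer or simple fraction, and the names `suggest_interval` and `number_of_classes`, like the extreme values used, are hypothetical.

```python
import math
from fractions import Fraction

# Candidate "simple" widths. This list is an assumption made for the sketch;
# the text asks only for "the nearest integer or simple fraction".
SIMPLE_WIDTHS = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 4), Fraction(1, 2),
                 Fraction(1), Fraction(2), Fraction(5), Fraction(10)]

def suggest_interval(lowest, highest, divisor=25):
    """Divide the range by `divisor` and take the nearest simple width."""
    approx = (highest - lowest) / divisor
    return min(SIMPLE_WIDTHS, key=lambda w: abs(float(w) - approx))

def number_of_classes(lowest, highest, width):
    """Classes needed to cover the range with the chosen width."""
    return math.ceil((highest - lowest) / float(width))

# Hypothetical extremes of a record of death-rates per thousand.
width = suggest_interval(12.6, 33.2)
print(width, number_of_classes(12.6, 33.2, width))
```

With these extremes the range divided by twenty-five is a little over 0.8, so the unit is chosen and some twenty-one classes result, within the limits of 15 and 25 recommended above.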

6. Position of Intervals. The position or starting-point of the
intervals is, as a rule, more or less indifferent, but in general it
is fixed either so that the limits of intervals are integers, or, as in
Tables I. and II., so that the mid-values are integers. It may,
however, be chosen, for simplicity in classification, so that no
limit corresponds exactly to any recorded value (cf. § 8 below). In
some exceptional cases, moreover, the observations exhibit a marked
clustering round certain values, e.g. tens, or tens and fives. This
is generally the case, for instance, in age returns, owing to the
tendency to state a round number where the true age is unknown.
Under such circumstances, the values round which there is a
marked tendency to cluster should preferably be made mid-values
of intervals, in order to avoid sensible error in the assumption that
the mid-value is approximately representative of the values in the
class. Thus, in the case of ages, since the clustering is chiefly round
tens, "25 and under 35," "35 and under 45," etc., the classification
of the English census, is a better grouping than "20 and under
30," "30 and under 40," and so on (cf. the Census of England and
Wales, 1911, vol. vii., and also ref. 5, in which a different view is
taken). When there is any probability of a clustering of this kind
occurring, it is as well to subject the raw material to a close
examination before finally fixing the classification.

7. Classification. The scale of intervals having been fixed, the
observations may be classified. If the number of observations is
not large, it will be sufficient to mark the limits of successive
intervals in a column down the left-hand side of a sheet of paper,
and transfer the entries of the original record to this sheet by
marking a 1 on the line corresponding to any class for each entry
assigned thereto. It saves time in subsequent totalling if each
fifth entry in a class is marked by a diagonal across the preceding
four, or by leaving a space.

The disadvantage in this process is that it offers no facilities for
checking: if a repetition of the classification leads to a different
result, there is no means of tracing the error. If the number of
observations is at all considerable and accuracy is essential, it is
accordingly better to enter the values observed on cards, one to
each observation. These are then dealt out into packs according
to their classes, and the whole work checked by running through
the pack corresponding to each class, and verifying that no cards
have been wrongly sorted.

8. In some cases difficulties may arise in classifying, owing to
the occurrence of observed values corresponding to class-limits.
Thus, in compiling Table I., some districts will have been noted
with death-rates entered in the Registrar-General's returns as
16.5, 17.5, or 18.5, any one of which might at first sight have
been apparently assigned indifferently to either of two adjacent
classes. In such a case, however, where the original figures for
numbers of deaths and population are available, the difficulty may
be readily surmounted by working out the rate to another place
of decimals: if the rate stated to be 16.50 proves to be 16.502, it
will be sorted to the class 16.5-17.5; if 16.498, to the class
15.5-16.5. Death-rates that work out to half-units exactly do
not occur in this example, and so there is no real difficulty. In
the case of Table II., again, there is no difficulty: if the year of
birth and death alone are given, the age at death is only calculable
to the nearest unit; if the actual day of birth and death be
cited, half-years still cannot occur in the age at death, because
there is an odd number of days in the year. The difficulty may
always be avoided if it be borne in mind in fixing the limits
to class-intervals, these being carried to a further place of decimals,
or a smaller fraction, than the values in the original record. Thus
if statures are measured to the nearest centimetre, the
class-intervals may be taken as 150.5-151.5, 151.5-152.5, etc.; if to
the nearest eighth of an inch, the intervals may be 59^^60-|^, 
60^1-61-]^, and so on. 

If the difficulty is not evaded in any of these ways, it is
usual to assign one-half of an intermediate observation to each
adjacent class, with the result that half-units occur in the
class-frequencies (cf. Tables VII., p. 90, X., p. 96, and XI.,
p. 96). The procedure is rough, but probably good enough for
practical purposes; strict precision is usually unattainable, for in
point of fact the odd way in which different individuals read a
scale (cf. Supplement I.) renders it impossible to assign exact
limits to intervals.
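The rule of assigning one-half of an intermediate observation to each adjacent class may be sketched as follows. This is a hedged illustration only: the function name `classify_with_halves` and the sample rates are assumptions made for the example, and a value falling exactly on the outermost limit is here counted wholly in the end class, a choice the text does not discuss.

```python
from bisect import bisect_right


def classify_with_halves(observations, limits):
    """Class-frequencies for classes limits[k]..limits[k+1] (limits ascending).

    A value equal to an interior class-limit contributes one-half to each of
    the two adjacent classes; every other value counts once in its own class.
    """
    n_classes = len(limits) - 1
    freq = [0.0] * n_classes
    for x in observations:
        if not limits[0] <= x <= limits[-1]:
            raise ValueError(f"{x} lies outside the scale of intervals")
        j = bisect_right(limits, x) - 1  # index of the last limit <= x
        if x == limits[j] and 0 < j < n_classes:
            freq[j - 1] += 0.5
            freq[j] += 0.5
        else:
            # min() keeps a value equal to the top limit in the last class.
            freq[min(j, n_classes - 1)] += 1.0
    return freq


# Hypothetical death-rates, two of which fall exactly on class-limits.
limits = [15.5, 16.5, 17.5, 18.5]
rates = [16.5, 16.2, 17.0, 17.5, 18.1, 15.9]
frequencies = classify_with_halves(rates, limits)
print(frequencies)
```

The half-units that result are of the same kind as those seen in the class-frequencies of Tables VII., X., and XI.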

9. Tabulation. As regards the actual drafting of the final
table, there is little to be said, except that care should be taken
to express the class-limits clearly, and, if necessary, to state the
manner in which the difficulty of intermediate values has been
met or evaded. The class-limits are perhaps best given as in
Tables I. and II., but may be more briefly indicated by the
mid-values of the class-intervals. Thus Table I. might have been
given in the form:


Death-rate per 1000 per annum      Number of Districts
to the Nearest Unit.               with said Death-rate.

        13                                  5
        14                                 16
        15                                 61
        16                                112
        etc.                              etc.


A common mode of defining the class-intervals is to state the
limits in the form "x and less than y." In the case of measurements
of stature, for example, the table might run:




Stature in Inches.          Number of Observations.

57 and less than 58                   2
58 and less than 59                   4
59 and less than 60                  14
etc.                                etc.


The statement "57 and less than 58," etc., is often abbreviated
to 57-, 58-, 59-, etc. (cf. Table VI., p. 88). The mode of grouping
is, in effect, that described in the last paragraph as of service in
avoiding intermediate observations, but it should be noted that the
form of statement leaves the class-limits uncertain unless the degree
of accuracy of the measurements is also given. Thus, if measurements
were taken to the nearest eighth of an inch, the class-limits
are really 56 15/16-57 15/16, 57 15/16-58 15/16, etc.; if they were
only taken to the nearest quarter of an inch, the limits are
56 7/8-57 7/8, 57 7/8-58 7/8, etc. With such a form of tabulation a
statement as to the number of significant figures in the original
record is therefore essential. It is better, perhaps, to state the
true class-limits and avoid ambiguity.
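The dependence of the true class-limits on the precision of measurement can be worked out with exact fractions. The sketch below is illustrative only: `true_limits` and `mixed` are hypothetical helper names, and it assumes that a reading recorded to the nearest unit of precision stands for a true value within half that unit on either side, and that the stated lower limit is itself a possible reading.

```python
from fractions import Fraction


def true_limits(lower, upper, precision):
    """True limits of the class 'lower and less than upper'.

    The smallest reading in the class is `lower` and the largest is
    `upper - precision`; each reading covers half a unit of precision
    on either side, so both limits shift down by precision / 2.
    """
    half = Fraction(precision) / 2
    return Fraction(lower) - half, Fraction(upper) - half


def mixed(q):
    """Format a non-negative fraction as a mixed number, e.g. '56 15/16'."""
    whole, rest = divmod(q, 1)
    return f"{whole} {rest}" if rest else f"{whole}"


for precision in (Fraction(1, 8), Fraction(1, 4)):
    lo, hi = true_limits(57, 58, precision)
    print(f"nearest {precision} inch: {mixed(lo)}-{mixed(hi)}")
```

For measurements to the nearest eighth of an inch this gives 56 15/16-57 15/16, and for the nearest quarter 56 7/8-57 7/8, in agreement with the limits stated above.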

10. The rule that class-intervals should be all equal is one
that is very frequently broken in official statistical publications,
principally in order to condense an otherwise unwieldy table,
thus not only saving space in printing but also considerable
expense in compilation, or possibly, in the case of confidential
figures, to avoid giving a class which would contain only one or
two observations, the identity of which might be guessed. It
would hardly be legitimate, for example, to give a return of
incomes relating to a limited district in such a form that the
income of the two or three wealthiest men in the district would
be clear to any intelligent reader with local knowledge. If the
intervals be made unequal, the application of many statistical
methods is rendered awkward, or even impossible, and the
relative values of the frequencies are at first sight misleading, so
that the table is not perspicuous. Thus, consider the first two
columns of Table IV., showing the numbers of dwelling-houses
of different annual values, assessed to inhabited house duty. On
running the eye down the column headed "number of houses" it
is at once caught by the two striking irregularities at the classes
"£60 and under £80," and "£100 and under £150." But these
have no real significance; they are merely due to changes from
a £10 to a £20, and then to a £50 interval. Moreover, the
intervals after £150 go on continuously increasing, but attention
is not directed thereto by any marked changes in the frequencies.
To make the latter really comparable inter se, they must first be
reduced to a common interval as basis, e.g. £10, by dividing the
fifth and sixth numbers by 2, the seventh by 5, the eighth by 15,
and so on. This gives the mean frequencies per £10 interval
tabulated in the third column of Table IV. The reduction is,
however, impossible in the case of the last class, for we are only
told the number of houses of £1000 annual value and upwards:
the magnitude of the class is indefinite. Such an indefinite class
is in many respects a great inconvenience, and should always be
avoided in work not subject to the necessary limitations of
official publications.

Table IV. Showing the Annual Value and Number of Dwelling-houses in
Great Britain assessed to Inhabited House Duty in 1885-6. (Cited from
Jour. Roy. Stat. Soc., vol. l., 1887, p. 610.)

Annual Value in £'s.        Number of Houses.    Frequency per
                                                 £10 Interval.

 £20 and under   £30            306,408             306,408
 £30 and under   £40            182,972             182,972
 £40 and under   £50            105,407             105,407
 £50 and under   £60             63,096              63,096
 £60 and under   £80             71,436              35,718
 £80 and under  £100             32,365              16,182
£100 and under  £150             41,336               8,267
£150 and under  £300             26,732               1,782
£300 and under  £500              6,198                 310
£500 and under £1000              2,098                  42
£1000 and upwards                   644                   -

Total number of houses          838,692                   -
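The reduction to a common £10 interval is simple arithmetic, and may be checked with the short sketch below. The list `table_iv` transcribes the first two columns of Table IV.; the function name `frequency_per_unit` and the use of `None` for the open-ended last class are assumptions made for the illustration.

```python
# (lower limit, upper limit, number of houses); None marks the open class.
table_iv = [
    (20, 30, 306408), (30, 40, 182972), (40, 50, 105407), (50, 60, 63096),
    (60, 80, 71436), (80, 100, 32365), (100, 150, 41336), (150, 300, 26732),
    (300, 500, 6198), (500, 1000, 2098), (1000, None, 644),
]


def frequency_per_unit(classes, unit=10):
    """Mean frequency per `unit` of the variable for each class.

    A class with no upper limit has an indefinite magnitude, so no
    reduction is possible and None is returned for it.
    """
    reduced = []
    for lower, upper, count in classes:
        if upper is None:
            reduced.append(None)
        else:
            reduced.append(count * unit / (upper - lower))
    return reduced


per_ten = frequency_per_unit(table_iv)
for (lower, upper, count), value in zip(table_iv, per_ten):
    label = f"{lower} and under {upper}" if upper else f"{lower} and upwards"
    shown = "indefinite" if value is None else f"{value:,.1f}"
    print(f"{label:>20}  {count:>8,}  {shown:>12}")
```

Once reduced, the apparent irregularities at "£60 and under £80" and "£100 and under £150" disappear and the frequencies fall continuously.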

The general rule that intervals should be equal must not be
held to bar the analysis by smaller equal intervals of some
portion of the range over which the frequency varies very
rapidly. In Table XII., p. 98, for example, giving the numbers
of deaths from diphtheria at successive ages, a five-year interval
might be substituted with advantage for the irregular intervals
after the fifth year of age, but it would still be desirable to give
the numbers of deaths in each year for the first five years, so as
to bring out the rapid rise to the maximum in the fourth year
of life.

11. When the table has been completed, it is often convenient
to represent the frequency-distribution by means of a diagram
which conveys the general run of the observations to the eye
better than a column of figures. The following short table,
giving the distribution of head-breadths for 1000 men, will serve
as an example.

Table V. Showing the Frequency-distribution of Head-breadths for Students
at Cambridge. Measurements taken to the nearest tenth of an inch.
(Cited from W. R. Macdonell, Biometrika, i., 1902, p. 220.)

Head-breadth        Number of Men with
in Inches.          said Head-breadth.

    5.5                      3
    5.6                     12
    5.7                     43
    5.8                     80
    5.9                    131
    6.0                    236
    6.1                    185
    6.2                    142
    6.3                     99
    6.4                     37
    6.5                     15
    6.6                     12
    6.7                      3
    6.8                      2

   Total                  1000


Taking a piece of squared paper ruled, say, in inches and tenths,
mark off along a horizontal base-line a scale representing
class-intervals; a half-inch to the class-interval would be suitable.
Then choose a vertical scale for the class-frequencies, say 50
observations per interval to the inch, and mark off, on the
verticals or ordinates through the points marked 5.5, 5.6, 5.7, ...
at the centres of the class-intervals on the base-line, heights
representing on this scale the class-frequencies 3, 12, 43, ....
The diagram may then be completed in one of two ways: (1)
as a frequency-polygon, by joining up the marks on the verticals
by straight lines, the last points at each end being joined
down to the base at the centre of the next class-interval (fig. 1);
or (2) as a column diagram or histogram (to use a term suggested
by Professor Pearson, ref. 1), short horizontals being drawn
through the marks on the verticals (fig. 2), which now form the
central axes of a series of rectangles representing the
class-frequencies. The student should note that in any such diagram,
of either form, a certain area represents a given number of
observations. On the scales suggested, 1 inch on the horizontal
represents 2 intervals, and 1 inch on the vertical represents 50
observations per interval: 1 square inch therefore represents
50 x 2 = 100 observations. The diagrams are, however,
conventional: the whole area of the figure is correct in either case,
but the area over each interval is not correct in the case of the
frequency-polygon, and the frequency of each fraction of any
interval is not the same, as suggested by the histogram. The
area shown by the frequency-polygon over any interval with an
ordinate $y_2$ (fig. 3) is only correct if the tops of the three
successive ordinates $y_1$, $y_2$, $y_3$ lie on a line, i.e. if
$y_2 = \tfrac{1}{2}(y_1 + y_3)$, the areas of the two little triangles
shaded in the figure being equal. If $y_2$ fall short of this value,
the area shown by the polygon is too great; if $y_2$ exceed it, the
area shown by the polygon is too small, and if, for this reason, the
frequency-polygon tends to become very misleading at any part of the
range, it is better to use the histogram. In the mortality
distribution of Table I., for instance, the frequency rises so sharply
to the maximum that a histogram is, on the whole, the better
representation of the distribution of frequency, and in such a
distribution as that of Table IV. the use of the histogram is
almost imperative.

Fig. 1. Frequency-Polygon for Head-breadths of 1000 Cambridge
Students. (Table V.)

Fig. 2. Histogram for the same data as Fig. 1.

Fig. 4.
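The statement that the frequency-polygon gives the whole area correctly, but not the area over each interval, can be verified numerically. The sketch below is an illustration under stated assumptions: areas are measured in units of one class-interval by one observation, the function names are hypothetical, and the per-interval formula is derived here from the geometry (the polygon's height at each class-limit is the mean of the two adjacent ordinates), not quoted from the text.

```python
def polygon_areas(freqs, width=1.0):
    """Area under the frequency-polygon over each class-interval.

    Over the class with ordinate y2 and neighbours y1, y3 the polygon
    runs from (y1 + y2)/2 up to y2 and on to (y2 + y3)/2; the two
    trapezia together have area width * (y1 + 6*y2 + y3) / 8.
    """
    padded = [0.0] + [float(f) for f in freqs] + [0.0]
    areas = []
    for i in range(1, len(padded) - 1):
        y1, y2, y3 = padded[i - 1], padded[i], padded[i + 1]
        areas.append(width * (y1 + 6 * y2 + y3) / 8)
    return areas


def polygon_total(freqs, width=1.0):
    """Whole area of the polygon, including the two small end triangles
    that lie over the empty classes beyond each end of the range."""
    ends = width * (freqs[0] + freqs[-1]) / 8
    return sum(polygon_areas(freqs, width)) + ends


# Class-frequencies of head-breadths from Table V.
head_breadths = [3, 12, 43, 80, 131, 236, 185, 142, 99, 37, 15, 12, 3, 2]

areas = polygon_areas(head_breadths)
print("whole area of polygon:", polygon_total(head_breadths))
print("whole area of histogram:", float(sum(head_breadths)))
print("class 6.0: histogram 236, polygon", areas[5])
```

For the class 6.0 the ordinate 236 exceeds the mean of its neighbours, 158, and the polygon accordingly shows too small an area over that interval, as stated above.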

12. If the class-interval be made smaller and smaller, and at
the same time the number of observations be proportionately
increased, so that the class-frequencies may remain finite, the
polygon and the histogram will approach more and more closely
to a smooth curve. Such an ideal limit to the frequency-polygon
or histogram is termed a frequency-curve. In this ideal
frequency-curve the area between any two ordinates whatever is strictly
proportional to the number of observations falling between the
corresponding values of the variable. Thus the number of
observations falling between the values $x_1$ and $x_2$ of the variable
in fig. 4 will be proportional to the area of the shaded strip in the
figure; the number of observed values greater than $x_2$ will
similarly be given by the area of the curve to the right of the
ordinate through $x_2$, and so on. When, in any actual case, the
number of observations is considerable (say a thousand at least)
the run of the class-frequencies is generally sufficiently
smooth to give a good notion of the form of the ideal
distribution; with small numbers the frequencies may present all
kinds of irregularities, which, most probably, have very little
significance (cf. Chap. XV. § 15, and § 18, Ex. iv.). The forms
presented by smoothly running sets of numerous observations
present an almost endless variety, but amongst these we notice
a small number of comparatively simple types, from which many
at least of the more complex distributions may be conceived as
compounded. For elementary purposes it is sufficient to consider
these fundamental simple types as four in number: the symmetrical
distribution, the moderately asymmetrical distribution, the
extremely asymmetrical or J-shaped distribution, and the U-shaped
distribution.

13. The symmetrical distribution, the class-frequencies decreasing
to zero symmetrically on either side of a central maximum.
Fig. 5 illustrates the ideal form of the distribution.

Being a special case of the more general type described under
the second heading, this form of distribution is comparatively rare
under any circumstances, and very exceptional indeed in economic
statistics. It occurs more frequently in the case of biometric, more
especially anthropometric, measurements, from which the following
illustrations are drawn, and is important in much theoretical work.
Table VI. shows the frequency-distribution of statures for adult
males in the British Isles, from data published by a British
Association Committee in 1883, the figures being given separately
for persons born in England, Scotland, Wales, and Ireland, and
totalled in the last column. These frequency-distributions are
approximately of the symmetrical type. The frequency-polygon
for the totals given by the last column of the table is shown
in fig. 6. The student will notice that an error of 1/16 inch,
scarcely appreciable in the diagram on its reduced scale, is neglected
in the scale shown on the base-line, the intervals being treated
as if they were 57-58, 58-59, etc. Diagrams should be drawn for
comparison showing, to a good open scale, the separate distributions
for England, Scotland, Wales, and Ireland.

Table VI. Showing the Frequency-distributions of Statures for Adult
Males born in England, Ireland, Scotland, and Wales. Final Report of
the Anthropometric Committee to the British Association. (Report, 1883,
p. 256.) As Measurements are stated to have been taken to the nearest
1/8th of an Inch, the Class-Intervals are here presumably
56 15/16-57 15/16, 57 15/16-58 15/16, and so on (cf. § 9). See Fig. 6.

Height without    Number of Men within said Limits of Height.
shoes, Inches.    Place of Birth:
                  England   Scotland    Wales   Ireland     Total

    57-                 1          -        1         -         2
    58-                 3          1        -         -         4
    59-                12          -        1         1        14
    60-                39          2        -         -        41
    61-                70          2        9         2        83
    62-               128          9       30         2       169
    63-               320         19       48         7       394
    64-               524         47       83        15       669
    65-               740        109      108        33       990
    66-               881        139      145        58      1223
    67-               918        210      128        73      1329
    68-               886        210       72        62      1230
    69-               753        218       52        40      1063
    70-               473        115       33        25       646
    71-               254        102       21        15       392
    72-               117         69        6        10       202
    73-                48         26        2         3        79
    74-                16         15        1         -        32
    75-                 9          6        1         -        16
    76-                 1          4        -         -         5
    77-                 1          1        -         -         2

   Total             6194       1304      741       346      8585





Fig. 6. Frequency-distribution of Stature for 8585 Adult Males born in
the British Isles. (Table VI.)


Table VII. gives two similar distributions from more recent
investigations, relating respectively to sons over 18 years of
age, with parents living, in Great Britain, and to students at
Cambridge. The polygons are shown in figs. 7 and 8. Both these
distributions are more irregular than that of fig. 6, but, roughly
speaking, they may all be held to be approximately symmetrical.

Table VII. Showing the Frequency-distribution of Statures for (1) 1078
English Sons (Karl Pearson, Biometrika, ii., 1903, p. 415); (2) for 1000
Male Students at Cambridge (W. R. Macdonell, Biometrika, i., 1902,
p. 220). See Figs. 7 and 8.

Stature in        Number of Men within said Limits of Stature.
Inches.           (1) English Sons.     (2) Cambridge Students.

59.5-60.5                2.0                      -
60.5-61.5                1.5                      -
61.5-62.5                3.5                     4.0
62.5-63.5               20.5                    19.0
63.5-64.5               38.5                    24.5
64.5-65.5               61.5                    40.5
65.5-66.5               89.5                    84.5
66.5-67.5              148.0                   123.5
67.5-68.5              173.5                   139.0
68.5-69.5              149.5                   179.0
69.5-70.5              128.0                   138.5
70.5-71.5              108.0                   108.0
71.5-72.5               63.0                    63.5
72.5-73.5               42.0                    47.5
73.5-74.5               29.0                    21.0
74.5-75.5                8.5                    12.0
75.5-76.5                4.0                     5.0
76.5-77.5                4.0                     0.5
77.5-78.5                3.0                      -
78.5-79.5                0.5                      -

  Total                 1078                    1000

Fig. 7. Frequency-distribution of Stature for 1078 "English Sons."
(Table VII.) Horizontal axis: stature in inches.

Fig. 8. Frequency-distribution of Stature for 1000 Cambridge
Students. (Table VII.) Horizontal axis: stature in inches.

14. The moderately asymmetrical distribution, the class-frequencies
decreasing with markedly greater rapidity on one side of
the maximum than on the other, as in fig. 9 (a) or (b). This is
the most common of all smooth forms of frequency-distribution,
illustrations occurring in statistics from almost every source. The
distribution of death-rates in the registration districts of England
and Wales, given in Table I., p. 77, is a somewhat rough example
of the type. The distribution of rates of pauperism in the same
districts (Table VIII. and fig. 10) is smoother and more like the
type (a) of fig. 9. The frequency attains a maximum for
districts with 2¾ to 3¼ per cent. of the population in receipt of
relief, and then tails off slowly to unions with 6, 7, and 8 per
cent. of pauperism.

Fig. 9. Ideal distributions of the moderately asymmetrical form:
(a), (b).

Fig. 10. Frequency-distribution of Pauperism (Percentage of the
Population in Receipt of Poor-law Relief) on 1st January 1891 in the
Registration Districts of England and Wales: 632 Districts.
(Table VIII.) Horizontal axis: percentage of the population in
receipt of relief.


Table VIII. Showing the Number of Registration Districts in England and
Wales with Different Percentages of the Population in receipt of Poor-law
Relief on the 1st January 1891. (Yule, Jour. Roy. Stat. Soc., vol. lix.,
1896, p. 347, q.v. for distributions for earlier years.) See Fig. 10.

Percentage of the          Number of Unions with
Population in receipt      given Percentage in
of Relief.                 receipt of Relief.

0.75-1.25                          18
1.25-1.75                          48
1.75-2.25                          72
2.25-2.75                          89
2.75-3.25                         100
3.25-3.75                          90
3.75-4.25                          75
4.25-4.75                          60
4.75-5.25                          40
5.25-5.75                          21
5.75-6.25                          11
6.25-6.75                           5
6.75-7.25                           1
7.25-7.75                           1
7.75-8.25                           0
8.25-8.75                           1

  Total                           632


While the distribution of stature is in general symmetrical, that
of weight is asymmetrical or skew, the greater frequencies lying
towards the lower end of the range. This is shown very well by
the data (Table IX. and fig. 11) collected by the same British
Association Committee, from the Report of which the data as to
stature were cited in the last section. As in the case of the stature
diagram (fig. 6), the small error of 1/2 lb. has been neglected, for
the sake of brevity, in lettering the base-line of fig. 11, the classes
being treated as if they were 90 lb.-100 lb., 100 lb.-110 lb.,
and so on.

Table IX. Showing the Frequency-distribution of Weights for Adult Males
born in England, Ireland, Scotland, and Wales. (Loc. cit., Table VI.)
Weights were taken to the nearest pound, consequently the true
Class-Intervals are 89.5-99.5, 99.5-109.5, etc. (§ 9).

Weight       Number of Men within given Limits of Weight.
in lbs.      Place of Birth:
             England   Scotland    Wales   Ireland     Total

  90-              2          -        -         -         2
 100-             26          1        2         5        34
 110-            133          8       10         1       152
 120-            338         22       23         7       390
 130-            694         63       68        42       867
 140-           1240        173      153        57      1623
 150-           1075        255      178        51      1559
 160-            881        275      134        36      1326
 170-            492        168      102        25       787
 180-            304        125       34        13       476
 190-            174         67       14         8       263
 200-             75         24        7         1       107
 210-             62         14        8         1        85
 220-             33          7        1         -        41
 230-             10          4        2         -        16
 240-              9          2        -         -        11
 250-              3          4        1         -         8
 260-              1          -        -         -         1
 270-              -          -        -         -         -
 280-              -          -        1         -         1

 Total          5552       1212      738       247      7749

Fig. 11. Frequency-distribution of Weight for 7749 Adult Males in
the British Isles. (Table IX.) Horizontal axis: weight in lbs.

Table X. and fig. 12 give a biological illustration, viz. the
distribution of fecundity (ratio of yearling foals produced to
coverings) in mares. The student should notice the difficulty
of classification in this case: the class-interval chosen throughout
the middle of the range is 1/15th, but the last interval is
"29/30-1." This is not a whole interval, but it is more than a
half, for all the cases of complete fecundity are reckoned into the
class. In the diagram (fig. 12) it has been reckoned as a whole
class, and this gives a smooth distribution.

Fig. 12. Frequency-distribution of Fecundity for Brood-mares:
2000 observations. (Table X.)

Table X. Showing the Frequency-distribution of Fecundity, i.e. the Ratio
of the Number of Yearling Foals produced to the Number of Coverings,
for Brood-mares (Race-horses) Covered Eight Times at Least. (Pearson,
Lee, and Moore, Phil. Trans., A, vol. cxcii. (1899), p. 303.) See Fig. 12.

Fecundity.         Number of Mares with Fecundity
                   between the Given Limits.

 1/30- 3/30                  2
 3/30- 5/30                  7.5
 5/30- 7/30                 11.5
 7/30- 9/30                 21.5
 9/30-11/30                 55
11/30-13/30                104.5
13/30-15/30                182
15/30-17/30                271.5
17/30-19/30                315
19/30-21/30                337
21/30-23/30                293.5
23/30-25/30                204
25/30-27/30                127
27/30-29/30                 49
29/30-1                     19

  Total                   2000.0

To take an illustration from meteorology, the distribution of
barometer heights at any one station over a period of time is, in
general, asymmetrical, the most frequent heights lying towards the
upper end of the range for stations in England and Wales.
Table XI. and fig. 13 show the distribution for daily observations
at Southampton during the years 1878-90 inclusive.

Table XI. Showing the Frequency-distribution of Barometer Heights for
Daily Observations during the Thirteen Years 1878-1890 at Southampton.
(Karl Pearson and A. Lee, Phil. Trans., A, vol. cxc. (1897), p. 428, q.v.
for numerous other distributions.) See Fig. 13.

Height of Barometer     Number of Days on which Height was
in Inches.              observed between the Given Limits.

28.45-28.55                       1
28.55-28.65                       2
28.65-28.75                       2
28.75-28.85                       4
28.85-28.95                       8.5
28.95-29.05                      13.5
29.05-29.15                      21.5
29.15-29.25                      37
29.25-29.35                      79
29.35-29.45                     108
29.45-29.55                     181.5
29.55-29.65                     254.5
29.65-29.75                     348.5
29.75-29.85                     463.5
29.85-29.95                     548.5
29.95-30.05                     602.5
30.05-30.15                     619.5
30.15-30.25                     500
30.25-30.35                     382
30.35-30.45                     237.5
30.45-30.55                     189.5
30.55-30.65                      88.5
30.65-30.75                      43.5
30.75-30.85                       7
30.85-30.95                       4
30.95-31.05                       1

  Total                        4748

The distributions of Tables VIII.-XI. all follow more or less the
type of fig. 9 (a), the frequency tailing off, at the steeper end of
the distribution, in such a way as to suggest that the ideal
curve is tangential to the base. Cases of greater asymmetry,
suggesting an ideal curve that meets the base (at one end) at a
finite angle, even a right angle, as in fig. 9 (b), are less frequent,
but occur occasionally. The distribution of deaths from diphtheria,
according to age, affords one such example of a more asymmetrical
kind. The actual figures for this case are given in Table XII., and
illustrated by fig. 14, and it will be seen that the frequency of
deaths reaches a maximum for children aged "3 and under 4,"
the number rising very rapidly to the maximum, and thence
falling so slowly that there is still an appreciable frequency for
persons over 60 or 70 years of age.

Table XII. Showing the Numbers of Deaths from Diphtheria at Different
Ages in England and Wales during the Ten Years 1891-1900. (Supplement
to 65th Annual Report of the Registrar-General, 1891-1900, p. 3.)
See Fig. 14.

Age in Years.        Number of Deaths          Number
                     between Given Limits      per Annum.
                     of Age.

Under 1 year               4,186                 4,186
  1-                      10,491                10,491
  2-                      11,218                11,218
  3-                      12,390                12,390
  4-                      11,194                11,194
  5-                      23,348                 4,670
 10-                       4,092                   818
 15-                       1,123                   225
 20-                         585                   117
 25-                         786                    79
 35-                         612                    61
 45-                         324                    32
 55-                         260                    26
 65-                         127                    13
 75 and upwards               35                     ?

  Total                   80,671                     -


15. The extremely asymmetrical, or J-shaped, distribution, the
class-frequencies running up to a maximum at one end of the
range, as in fig. 15.

This may be regarded as the extreme form of the last distribution,
from which it cannot always be distinguished by elementary
methods if the original data are not available. If, for instance,
the frequencies of Table XII. had been given by five-year intervals
only, they would have run 49,479, 23,348, 4,092, and so on,
thus suggesting a maximum number of deaths at the beginning
of life, i.e. a distribution of the present type. It is only the
analysis of the deaths in the earlier years of life by one-year
intervals which shows that the frequency reaches a true maximum
in the fourth year, and therefore the distribution is of the
moderately asymmetrical type. In practical cases no hard and
fast line can always be drawn between the moderately and
extremely asymmetrical types, any more than between the
moderately asymmetrical and the symmetrical type.

Fig. 15. An ideal Distribution of the extreme Asymmetrical Form.
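The effect of coarse grouping on the apparent position of the maximum can be reproduced from the first few classes of Table XII. The sketch below is illustrative only: `regroup` and `modal_class` are hypothetical helper names, and the frequency for ages 5 to 10 is taken as 23,348, the figure quoted in the text.

```python
# (lower age, upper age, deaths) for the first fifteen years of life.
table_xii = [
    (0, 1, 4186), (1, 2, 10491), (2, 3, 11218), (3, 4, 12390),
    (4, 5, 11194), (5, 10, 23348), (10, 15, 4092),
]


def regroup(classes, width):
    """Pool class-frequencies into equal intervals of the given width."""
    grouped = {}
    for lower, upper, deaths in classes:
        start = (lower // width) * width
        if upper > start + width:
            raise ValueError("class straddles two of the wider intervals")
        grouped[start] = grouped.get(start, 0) + deaths
    return grouped


def modal_class(frequencies):
    """Lower limit of the class with the greatest frequency."""
    return max(frequencies, key=frequencies.get)


five_year = regroup(table_xii, 5)
# Frequencies per single year of age, so that unequal classes are comparable.
per_year = {lower: deaths / (upper - lower) for lower, upper, deaths in table_xii}

print("five-year grouping:", five_year)
print("apparent maximum with five-year grouping at age", modal_class(five_year))
print("maximum with one-year analysis at age", modal_class(per_year))
```

With five-year intervals the greatest frequency appears at the very beginning of life; only the one-year analysis places it in the fourth year.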

In economic statistics this form of distribution is particularly
characteristic of the distribution of wealth in the population at
large, as illustrated, e.g., by income tax and house valuation returns,
by returns of the size of agricultural holdings, and so on (cf. ref. 4).
The distributions may possibly be a very extreme case of the last
type; but if the maximum is not absolutely at the lower end of the
range, it is very close indeed thereto. Official returns do not
usually give the necessary analysis of the frequencies at the
lower end of the range to enable the exact position of the maximum
to be determined, and for this reason the data on which Table
XIII. is founded, though of course very unreliable, are of some
interest. It will be seen from the table and fig. 16 that with the
given classification the distribution appears clearly assignable to
the present type, the number of estates between zero and £100
in annual value being more than six times as great as the number
between £100 and £200 in annual value, and the frequency
continuously falling as the value increases. A close analysis of
the first class suggests, however, that the greatest frequency does
not occur actually at zero, but that there is a true maximum
frequency for estates of about £1 15 0 in annual value. The
distribution might therefore be more correctly assigned to the
second type, but the position of the greatest frequency indicates a

Table XIII. Showing the Numbers and Annual Values of the Estates of
those who had taken part in the Jacobite Rising of 1715. (Compiled from
Cosin's Names of the Roman Catholics, Nonjurors, and others who refused
to take the Oaths to his late Majesty King George, etc.; London, 1745.
Figures of very doubtful absolute value. See a note in Southey's
Commonplace Book, vol. ii. p. 573, quoted from the Memoirs of T. Hollis.)
See Fig. 16.


Annual
Value in
£100.

Number of
Estates.

Annual
Value in
£100.

Number of
Estates.

0- 1 

1726'6 

17-18 

1 

1- 2 

280 

— 


2- 3 

140*5 

20-21 


3- 4 

87 

21 ~22 


4- 6 

46 5 

22-23 


6- 6 

42 5 

‘23-24 


6 7 

29 5 

— 



7- 8 

25*5 

27-28 

2 

8- 9 

18 6 

— 



9-10 

21 

Sl-32 

1 


11 5 




11-12 

9*5 

39-40 

1 

12-13 

4 




13-14 

3 5 

45-46 

1 

14-16 

8 





15-16 

3 

48-49 

1 

16-17 

1 5 



1 

Total 

2476 













degree of asymmetry that is high even compared with the
asymmetry of fig. 14: the distribution of numbers of deaths from
diphtheria would more closely resemble the distribution of
estate-values if the maximum occurred in the fourth and fifth weeks
of life instead of in the fourth year. The figures of Table IV.,
p. 83, showing the annual value and number of dwelling-houses,
afford a good illustration of this form of distribution, but marred
by the unequal intervals so common in official returns.

Fig. 16. Frequency-distribution of the Annual Values of certain Estates
in England in 1715: 2476 Estates. (Table XIII.)


Table XIV. Showing the Frequencies of Different Numbers of Petals for
Three Series of Ranunculus bulbosus. (H. de Vries, Ber. dtsch. bot. Ges.,
Bd. xii., 1894, q.v. for details.) See Fig. 17.

Number          Frequency.
of Petals.      Series A.    Series B.    Series C.

    5              312          345          133
    6               17           24           55
    7                4            7           23
    8                2            -            7
    9                2            2            2
   10                -            -            2
   11                -            2            -

  Total            337          380          222


The type is not very frequent in other classes of material, but
instances occur here and there. Table XIV. and fig. 17 show
distributions of this form for the petals of the buttercup,
Ranunculus bulbosus.

Fig. 17. Frequency-distributions of Numbers of Petals for Three Series
of Ranunculus bulbosus: A 337, B 380, C 222 observations. (Table XIV.)

16. The Xl-shuped dist/nbutioTiy exhibiting a maximum frequency 




VI.— THE FKIQUENCT-DISTRIBUTION. 


103 


at the ends of the range and a minimum towards the centre. 
The ideal form of the distribution is illustrated by fig 



This is a rare but interesting form of distribution, as it stands 
m somewhat marked contrast to the preceding forms Table XY. 
and fig 19 illustrate an example based on a considerable number 
of observations, viz. the distribution of degrees of cloudiness, or 
estimated percentage of the sky covered by cloud, at Breslau 


Table XV. — Shmoing tJie Frequencies of Estimated Intensities of Cloudiness 
at Breslau during the Ten Years 1876-85. (See ref 2 ) See Fig. 19 


Cloudiness 

Frequency. 

Cloudiness 

Frequency 

0 

751 

6 

21 

1 

179 

7 

71 

2 

107 

8 

194 

3 

69 

9 

117 

4 ! 

46 

10 

2089 

5 

9 

Total 

3663 





104 


THEORY OF STATISTICS. 


during the years 1876-85 A sky completely, or almost com- 
pletely, overcast at the time of obseivation is the most common, 
a practically clear sky comes next, and mtei mediates ai'e moie 

This form of distribution appears to be sometimes exhibited by 
the percentages of offspring possessing a ceitam attribute when one 
at least of the parents also possesses the attribute. The remarks 



Fio 19. — Frequency-distribution of Degrees of Cloudiness at Breslau 
18/6-86 : 8668 observations (Table XV ) 


of Sir Francis Gal ton in Natural Inheritance suggest such a 
form for the distribution of “ consumptivity amongst the off- 
spring of consumptives, but the figures are not in a decisive shape 
Table XVI gives the distribution for an analogous case, viz. the 

Table XVI. — Showing the Percentages of Deaf-mutes among Children of
Parents one of whom at least was a Deaf-mute, for Marriages producing
Five Children or more. (Compiled from material in Marriages of the Deaf
in America, ed. E. A. Fay, Volta Bureau, Washington, 1898.)

  Percentage of    Number of       Percentage of    Number of
  Deaf-mutes.      Families.       Deaf-mutes.      Families.
      0-20           220               60-80            5.5
     20-40            20.5             80-100          15
     40-60            12               Total          273






distribution of deaf-mutism amongst the offspring of parents one
of whom at least was a deaf-mute. In general less than one-fifth
of the children are deaf-mutes: at the other end of the range the
cases in which over 80 per cent. of the children are deaf-mutes are
nearly three times as many as those in which the percentage lies
between 60 and 80. The numbers are, however, too small to form
a very satisfactory illustration.

REFERENCES 

(1) Pearson, Karl, “Skew Variation in Homogeneous Material,” Phil.
    Trans. Roy. Soc., Series A, vol. clxxxvi. (1895), pp. 343-414.

(2) Pearson, Karl, “Cloudiness: Note on a Novel Case of Frequency,”
    Proc. Roy. Soc., vol. lxii. (1897), p. 287.

(3) Pearson, Karl, “Supplement to a Memoir on Skew Variation,” Phil.
    Trans. Roy. Soc., Series A, vol. cxcvii. (1901), pp. 443-459.

(4) Pareto, Vilfredo, Cours d'économie politique; 2 vols., Lausanne, 1896-7.
    See especially tome ii., livre iii., chap. i., “La courbe des revenus.”

The first three memoirs above are mathematical memoirs on the theory
of ideal frequency-curves, the first being the fundamental memoir, and
the second and third supplementary. The elementary student may,
however, refer to them with advantage, on account of the large collection
of frequency-distributions which is given, and from which some of the
illustrations in the preceding chapter have been cited. Without
attempting to follow the mathematics, he may also note that each of
our rough empirical types may be divided into several sub-types, the
theoretical division into types being made on different grounds.

The fourth work is cited on account of the author’s discussion of the
distribution of wealth in a community, to which reference was made in § 15.
In connection with the remarks in § 6, on the grouping of ages,
reference may be made to the following, in which a different conclusion
is drawn as to the best grouping:—

(5) Young, Allyn A., “A Discussion of Age Statistics,” Census Bulletin,
    Bureau of the Census, Washington, U.S.A., 1904.

Reference should also be made to the Census of England and Wales,
1911, vol. vii., “Ages and Condition as to Marriage,” especially the
Report by Mr George King on the graduation of ages.

EXERCISES. 

1. If the diagram fig. 6 is redrawn to scales of 300 observations per interval
to the inch and 4 inches of stature to the inch, what is the scale of
observations to the square inch?

If the scales are 100 observations per interval to the centimetre and 2 inches
of stature to the centimetre, what is the scale of observations to the
square centimetre?

2. If fig. 10 is redrawn to scales of 25 observations per interval to the inch and
2 per cent. to the inch, what is the scale of observations to the square inch?

If the scales are ten observations per interval to the centimetre and 1 per cent.
to the centimetre, what is the scale of observations to the square centimetre?

3. If a frequency-polygon be drawn to represent the data of Table I., what
number of observations will the polygon show between death-rates of 16.5
and 17.5 per thousand, instead of the true number 159?

4. If a frequency-polygon be drawn to represent the data of Table V.,
what number of observations will the polygon show between head-breadths
5.95 and 6.05, instead of the true number 236?



CHAPTER VII. 

AVERAGES. 

1. Necessity for quantitative definition of the characters of a
frequency-distribution — 2. Measures of position (averages) and of
dispersion — 3. The dimensions of an average the same as those of the
variable — 4. Desirable properties for an average to possess — 5. The
commoner forms of average — 6-13. The arithmetic mean: its definition,
calculation, and simpler properties — 14-18. The median: its definition,
calculation, and simpler properties — 19-20. The mode: its definition and
relation to mean and median — 21. Summary comparison of the preceding
forms of average — 22-26. The geometric mean: its definition, simpler
properties, and the cases in which it is specially applicable — 27. The
harmonic mean: its definition and calculation.

1. In § 2 of the last chapter it was pointed out that a classification
of the observations in any long series is the first step necessary
to make the observations comprehensible, and to render possible
those comparisons with other series which are essential for any
discussion of causation. Very little experience, however, would
show that classification alone is not an adequate method, seeing
that it only enables qualitative or verbal comparisons to be made.
The next step that it is desirable to take is the quantitative
definition of the characters of the frequency-distribution, so that
quantitative comparisons may be made between the corresponding
characters of two or more series. It might seem at first sight
that very difficult cases of comparison could arise in which, for
example, we had to contrast a symmetrical distribution with a
“J-shaped” distribution. As a matter of practice, however, we seldom
have to deal with such a case; distributions drawn from similar
material are, in general, of similar form. When we have to
compare the frequency-distributions of stature in two races of
man, of the death-rates in English registration districts in two
successive decades, of the numbers of petals in two races of the
same species of Ranunculus, we have only to compare with each
other two distributions of the same or nearly the same type.
2. Confining our attention, then, to this simple case, there are 
two fundamental characteristics in which such distributions may 



differ: (1) they may differ markedly in position, i.e. in the values
of the variable round which they centre, as in fig. 20, A, or (2)
they may centre round the same value, but differ in the range of
variation or dispersion, as it is termed, as in fig. 20, B. Of course
the distributions may differ in both characters at once, as in fig. 20,
C, but the two properties may be considered independently.
Measures of the first character, position, are generally known as
averages; measures of the second are termed measures of dispersion.
In addition to these two principal and fundamental
characters, we may also take a third of some interest but of much
less importance, viz. the degree of asymmetry of the distribution.



Fig. 20 


The present chapter deals only with averages; measures of
dispersion are considered in Chapter VIII. and measures of
asymmetry are also briefly discussed at the end of that chapter.
3. In whatever way an average is defined, it may be as well to
note, it is merely a certain value of the variable, and is therefore
necessarily of the same dimensions as the variable: i.e. if the
variable be a length, its average is a length; if the variable be a
percentage, its average is a percentage, and so on. But there are
several different ways of approximately defining the position of a
frequency-distribution, that is, there are several different forms of
average, and the question therefore arises, By what criteria are we
to judge the relative merits of different forms? What are, in fact,
the desirable properties for an average to possess?




4. (a) In the first place, it almost goes without saying that an
average should be rigidly defined, and not left to the mere estimation
of the observer. An average that was merely estimated would
depend too largely on the observer as well as the data. (b) An
average should be based on all the observations made. If not,
it is not really a characteristic of the whole distribution. (c) It
is desirable that the average should possess some simple and
obvious properties to render its general nature readily comprehensible:
an average should not be of too abstract a mathematical
character. (d) It is, of course, desirable that an average should
be calculated with reasonable ease and rapidity. Other things
being equal, the easier calculated is the better of two forms of
average. At the same time too great weight must not be attached
to mere ease of calculation, to the neglect of other factors. (e)
It is desirable that the average should be as little affected as
may be possible by what we have termed fluctuations of sampling.
If different samples be drawn from the same material, however
carefully they may be taken, the averages of the different samples
will rarely be quite the same, but one form of average may show
much greater differences than another. Of the two forms, the
more stable is the better. The full discussion of this condition
must, however, be postponed to a later section of this work
(Chap. XVII.). (f) Finally, by far the most important desideratum
is this, that the measure chosen shall lend itself readily to
algebraical treatment. If, e.g., two or more series of observations
on similar material are given, the average of the combined series
should be readily expressed in terms of the averages of the
component series; if a variable may be expressed as the sum of
two or more others, the average of the whole should be readily
expressed in terms of the averages of its parts. A measure for
which simple relations of this kind cannot be readily determined
is likely to prove of somewhat limited application.

5. There are three forms of average in common use, the
arithmetic mean, the median, and the mode, the first named being
by far the most widely used in general statistical work. To
these may be added the geometric mean and the harmonic mean,
more rarely used, but of service in special cases. We will consider
these in the order named.

6. The arithmetic mean. — The arithmetic mean of a series of
values of a variable $X_1, X_2, X_3, \dots, X_N$, $N$ in number, is the
quotient of the sum of the values by their number. That is to
say, if $M$ be the arithmetic mean,

$$M = \frac{1}{N}(X_1 + X_2 + X_3 + \dots + X_N),$$





or, to express it more briefly by using the symbol $\Sigma$ to denote
“the sum of all quantities like,”

$$M = \frac{1}{N}\Sigma(X) \qquad (1)$$

The word mean or average alone, without qualification, is very
generally used to denote this particular form of average: that
is to say, when anyone speaks of “the mean” or “the average”
of a series of observations, it may, as a rule, be assumed that the
arithmetic mean is meant. It is evident that the arithmetic
mean fulfils the conditions laid down in (a) and (b) of § 4, for it
is rigidly defined and based on all the observations made.
Further, it fulfils condition (c), for its general nature is readily
comprehensible. If the wages-bill for N workmen is £P, the
arithmetic mean wage, P/N pounds, is the amount that each
would receive if the whole sum available were divided equally
between them: conversely, if we are told that the mean wage
is £M, we know this means that the wages-bill is N.M pounds.
Similarly, if N families possess a total of C children, the mean
number of children per family is C/N, the number that each
family would possess if the children were shared uniformly.
Conversely, if the mean number of children per family is M, the
total number of children in N families is N.M. The arithmetic
mean expresses, in fact, a simple relation between the whole
and its parts.
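
The relation between the whole and its parts may be sketched in a few
lines of Python. The wages used are hypothetical figures chosen only for
illustration, and the name arithmetic_mean is our own; exact fractions
are used so that the check N.M = P involves no rounding.

```python
from fractions import Fraction


def arithmetic_mean(values):
    """Quotient of the sum of the values by their number."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty series is undefined")
    return Fraction(sum(values), len(values))


# Hypothetical weekly wages (in shillings) of N = 5 workmen.
wages = [30, 42, 36, 48, 39]
N = len(wages)
P = sum(wages)               # the wages-bill
M = arithmetic_mean(wages)   # the mean wage, P/N

print("mean wage:", M)
print("wages-bill recovered as N.M:", N * M == P)
```

Only the total P and the number N enter the result, which is why the mean
can be found without listing the individual wages.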

7. As regards simplicity of calculation, the mean takes a high
position. In the cases just cited, it will be noted that the mean
is actually determined without even the necessity of determining
or noting all the individual values of the variable: to get the
mean wage we need not know the wages of every hand, but only
the wages-bill; to get the mean number of children per family
we need not know the number in each family, but only the total.
If this total is not given, but we have to deal with a moderate
number of observations, so few (say 30 or 40) that it is hardly
worth while compiling the frequency-distribution, the arithmetic
mean is calculated directly as suggested by the definition, i.e.
all the values observed are added together and the total divided
by the number of observations. But if the number of observations
be large, this direct process becomes a little lengthy. It may
be shortened considerably by forming the frequency-table and
treating all the values in each class as if they were identical with
the mid-value of the class-interval, a process which in general
gives an approximation that is quite sufficiently exact for practical
purposes if the class-interval has been taken moderately





small (cf. Chap. VI. § 6). In this process each class-frequency
is multiplied by the mid-value of the interval, the products added
together, and the total divided by the number of observations.
If f denote the frequency of any class, X the mid-value of the
corresponding class-interval, the value of the mean so obtained
may be written —

$$M = \frac{1}{N}\Sigma(f.X) \qquad (2)$$
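
As a rough sketch of equation (2), the following Python fragment groups a
short series into classes and compares the mean from the frequency-table
with the mean computed directly. The twelve observations and the
class-interval of 2 units are hypothetical, and the helper names tabulate
and grouped_mean are our own.

```python
from collections import Counter


def tabulate(values, width, start=0.0):
    """Frequency-table keyed by class mid-value, for classes
    [start + k*width, start + (k+1)*width)."""
    table = Counter()
    for v in values:
        k = int((v - start) // width)
        table[start + (k + 0.5) * width] += 1
    return dict(table)


def grouped_mean(table):
    """Equation (2): M = (1/N) * Sigma(f.X), X being the class mid-value."""
    n = sum(table.values())
    return sum(f * mid for mid, f in table.items()) / n


# Hypothetical observations, grouped with a class-interval of 2 units.
observations = [1.2, 2.7, 3.1, 3.8, 4.4, 4.9, 5.3, 5.6, 6.2, 7.1, 7.9, 9.0]
table = tabulate(observations, width=2.0)

exact = sum(observations) / len(observations)
approx = grouped_mean(table)
print("direct mean :", round(exact, 3))
print("grouped mean:", round(approx, 3))
```

The two results differ slightly, the difference being the error of
grouping; it shrinks as the class-interval is made smaller.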

8. But this procedure is still further abbreviated in practice
by the following artifices:—(1) The class-interval is treated
as the unit of measurement throughout the arithmetic; (2) the
difference between the mean and the mid-value of some arbitrarily
chosen class-interval is computed instead of the absolute
value of the mean.

If A be the arbitrarily chosen value and

$$X = A + \xi \qquad (3)$$

then

$$\Sigma(f.X) = \Sigma(f.A) + \Sigma(f.\xi),$$

or, since A is a constant,

$$M = A + \frac{1}{N}\Sigma(f.\xi) \qquad (4)$$

The calculation of $\Sigma(f.X)$ is therefore replaced by the calculation
of $\Sigma(f.\xi)$. The advantage of this is that the class-frequencies
need only be multiplied by small integral numbers, for A
being the mid-value of a class-interval, and X the mid-value of
another, and the class-interval being treated as a unit, the $\xi$'s
must be a series of integers proceeding from zero at the arbitrary
origin A. To keep the values of $\xi$ as small as possible, A should
be chosen near the middle of the range.

It may be mentioned here that $\Sigma(\xi)$, or $\Sigma(f.\xi)$ for the grouped
distribution, is sometimes termed the moment of the distribution
about the arbitrary origin A: we shall not, however, make
use of this term.

9. The process is illustrated by the following example, using
the frequency-distribution of Table VIII., Chap. VI. The
arbitrary origin A is taken at 3.5 per cent., the middle of the
sixth class-interval from the top of the table, and a little nearer
than the middle of the range to the estimated position of the
mean. The consequent values of $\xi$ are then written down as in
column (3) of the table, against the corresponding frequencies, the
values starting, of course, from zero opposite 3.5 per cent. Each
frequency f is then multiplied by its $\xi$ and the products entered





in another column (4). The negative and positive products are
totalled separately, giving totals -776 and +509 respectively,
whence $\Sigma(f.\xi) = -267$. Dividing this by N, viz. 632, we have
the difference of M from A in class-intervals, viz. -0.42 intervals,
that is -0.21 per cent. Hence the mean is 3.5 - 0.21 = 3.29
per cent.


Calculation of the Mean: Example i. — Calculation of the Arithmetic
Mean of the Percentages of the Population in receipt of Relief, from the
Figures of Table VIII., Chap. VI., p. 93.

        (1)               (2)              (3)             (4)
   Mid-values of       Frequency        Deviation        Product
  the Class-intervals      f.         from Arbitrary      f.$\xi$.
   (Percentage in                     Value A, $\xi$.
  receipt of Relief).

        1.0                18              - 5              90
        1.5                48              - 4             192
        2.0                72              - 3             216
        2.5                89              - 2             178
        3.0               100              - 1             100
        3.5                90                0            -776
        4.0                75              + 1              75
        4.5                60              + 2             120
        5.0                40              + 3             120
        5.5                21              + 4              84
        6.0                11              + 5              55
        6.5                 5              + 6              30
        7.0                 1              + 7               7
        7.5                 1              + 8               8
        8.0                 —              + 9               —
        8.5                 1              +10              10

       Total              632                —            +509

$$\Sigma(f.\xi) = +509 - 776 = -267$$

$$M - A = -\frac{267}{632} \text{ class-intervals} = -0.42 \text{ class-intervals} = -0.21 \text{ units}$$

$$\therefore \text{ mean } M = 3.5 - 0.21 = 3.29 \text{ per cent.}$$


It must always be remembered that $\Sigma(f.\xi)/N$ gives the value of
M - A in class-intervals, and must not be added directly to A
unless the interval is also a unit. In the present illustration the






interval is half a unit, and accordingly the quotient 267/632 is
halved in order to obtain an answer in units. Care must also be
taken to give the right sign to the quotient.
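
The arithmetic of Example i. may be retraced with the following Python
sketch, which uses the frequencies of the table and keeps the class-interval
(half a unit) separate from the deviations counted in class-intervals. The
function name mean_by_arbitrary_origin is our own, and exact fractions are
used merely to avoid rounding.

```python
from fractions import Fraction

# Mid-values (per cent.) and frequencies as given in Example i.
mid_values = [Fraction(k, 2) for k in range(2, 18)]   # 1, 1.5, ..., 8.5
frequencies = [18, 48, 72, 89, 100, 90, 75, 60, 40, 21, 11, 5, 1, 1, 0, 1]


def mean_by_arbitrary_origin(mid_values, frequencies, origin, interval):
    """Equation (4), the deviations xi being counted in class-intervals."""
    n = sum(frequencies)
    xi = [(x - origin) / interval for x in mid_values]     # small integers
    total = sum(f * d for f, d in zip(frequencies, xi))    # Sigma(f.xi)
    # The quotient total/n is in class-intervals, so it is multiplied
    # by the interval before being added to the origin.
    return total, origin + interval * total / n


A = Fraction(7, 2)   # arbitrary origin, 3.5 per cent.
h = Fraction(1, 2)   # class-interval, half a unit
total, M = mean_by_arbitrary_origin(mid_values, frequencies, A, h)

# The direct calculation by equation (2) must give the same mean.
direct = sum(f * x for f, x in zip(frequencies, mid_values)) / sum(frequencies)

print("Sigma(f.xi) =", total)
print("mean =", round(float(M), 2), "per cent.")
print("agrees with equation (2):", M == direct)
```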

10. As the process is an important one we give a second illustration
from the figures of Table VI., Chap. VI. In this case the
class-interval is a unit (1 inch), so the value of M - A is given directly
by dividing $\Sigma(f.\xi)$ by N. The student must notice that, measures
having been made to the nearest eighth of an inch, the mid-values
of the intervals are 57 7/16, 58 7/16, etc., and not 57.5, 58.5, etc.


Calculation of the Mean: Example ii. — Calculation of the Arithmetic
Mean Stature of Male Adults in the British Isles from the Figures of
Chap. VI., Table VI., p. 88.

        (1)               (2)              (3)             (4)
      Height,          Frequency        Deviation        Product
      Inches.              f.         from Arbitrary      f.$\xi$.
                                      Value A, $\xi$.

        57-                 2              -10              20
        58-                 4              - 9              36
        59-                14              - 8             112
        60-                41              - 7             287
        61-                83              - 6             498
        62-               169              - 5             845
        63-               394              - 4            1576
        64-               669              - 3            2007
        65-               990              - 2            1980
        66-              1223              - 1            1223
        67-              1329                0           -8584
        68-              1230              + 1            1230
        69-              1063              + 2            2126
        70-               646              + 3            1938
        71-               392              + 4            1568
        72-               202              + 5            1010
        73-                79              + 6             474
        74-                32              + 7             224
        75-                16              + 8             128
        76-                 5              + 9              45
        77-                 2              +10              20

       Total             8585                —           +8763

$$\Sigma(f.\xi) = +8763 - 8584 = +179$$

$$M - A = +\frac{179}{8585} = +0.02 \text{ class-intervals or inches.}$$

$$\therefore M = 67\tfrac{7}{16} + 0.02 = 67.46 \text{ inches.}$$






It is evident that an absolute check on the arithmetic of any
such calculation may be effected by taking a different arbitrary
origin for the deviations: all the figures of col. (4) will be changed,
but the value ultimately obtained for the mean must be the
same. The student should note that a classification by unequal
intervals is, at best, a hindrance to this simple form of calculation,
and the use of an indefinite interval for the extremity of the
distribution renders the exact calculation of the mean impossible
(cf. Chap. VI. § 10).
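
The check by a change of origin may be sketched as follows, using the
frequencies of Example ii. The second origin, 65 7/16 inches, is an
arbitrary choice of our own, as is the function name mean_from_origin.

```python
from fractions import Fraction

# Frequencies of the classes 57-, 58-, ..., 77- inches (Example ii.).
frequencies = [2, 4, 14, 41, 83, 169, 394, 669, 990, 1223, 1329,
               1230, 1063, 646, 392, 202, 79, 32, 16, 5, 2]
# Mid-values 57 7/16, 58 7/16, ..., the class-interval being 1 inch.
mid_values = [Fraction(57 * 16 + 7, 16) + k for k in range(len(frequencies))]


def mean_from_origin(origin):
    """Return column (4), the products f.xi, and the resulting mean."""
    n = sum(frequencies)
    products = [f * (x - origin) for f, x in zip(frequencies, mid_values)]
    return products, origin + sum(products) / n


col4_a, mean_a = mean_from_origin(Fraction(67 * 16 + 7, 16))  # origin of the text
col4_b, mean_b = mean_from_origin(Fraction(65 * 16 + 7, 16))  # a different origin

print("Sigma(f.xi), first origin :", sum(col4_a))
print("Sigma(f.xi), second origin:", sum(col4_b))
print("means agree:", mean_a == mean_b, "->", round(float(mean_a), 2), "inches")
```

Every entry of column (4) differs between the two calculations, so that
agreement of the final means is a genuine test of the arithmetic.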

11. We return again below (§ 13) to the question of the 



Fig. 21. — Showing the Arithmetic Mean M, the Median Mi, and the Mode Mo,
by verticals drawn through the corresponding points on the base, for the
distribution of pauperism of fig. 10, p. 92. (Horizontal axis: percentage
of the population in receipt of relief.)


errors caused by the assumption that all values within the same
interval may be treated as approximately the mid-value of the
interval. It is sufficient to say here that the error is in general
very small and of uncertain sign for a distribution of the
symmetrical or only moderately asymmetrical type, provided of
course the class-interval is not large (Chap. VI. § 5). In the case
of the “J-shaped” or extremely asymmetrical distribution, however,
the error is evidently of definite sign, for in all the intervals
the frequency is piled up at the limit lying towards the greatest
frequency, i.e. the lower end of the range in the case of the
illustrations given in Chap. VI., and is not evenly distributed over the



interval. In distributions of such a type the intervals must be
made very small indeed to secure an approximately accurate value
for the mean. The student should test for himself the effect of
different groupings in two or three different cases, so as to get
some idea of the degree of inaccuracy to be expected.
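
A test of the kind suggested may be sketched in Python. The “J-shaped”
series used here is an assumption of our own, not one of the distributions
of Chap. VI.: it consists of 1000 values placed at equally spaced
quantiles of an exponential curve, so that the frequency is greatest at
the lower end of the range. The function name grouped_mean is likewise
our own.

```python
import math


def grouped_mean(values, width):
    """Mean obtained when every value is replaced by the mid-value
    of the class-interval [k*width, (k+1)*width) containing it."""
    mids = [(math.floor(v / width) + 0.5) * width for v in values]
    return sum(mids) / len(mids)


# Hypothetical J-shaped series: quantiles of an exponential curve.
n = 1000
j_shaped = [-math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)]
exact = sum(j_shaped) / n

error_wide = grouped_mean(j_shaped, 1.0) - exact     # class-interval of 1 unit
error_narrow = grouped_mean(j_shaped, 0.1) - exact   # class-interval of 0.1 unit

print("error with wide intervals  :", round(error_wide, 4))
print("error with narrow intervals:", round(error_narrow, 4))
```

With the wide grouping the mean is overstated, since the mid-value of
each interval lies above the bulk of the values within it; the narrow
grouping reduces the error to a small fraction of this amount.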

12. If a diagram has been drawn representing the frequency-distribution,
the position of the mean may conveniently be
indicated by a vertical through the corresponding point on the
base. Thus fig. 21 (a reproduction of fig. 10) shows the
frequency-polygon for our first illustration, and the vertical MM indicates
the mean. In a moderately asymmetrical distribution at all of
this form the mean lies, as in the present example, on the side of
the greatest frequency towards the longer “tail” of the distribution:
MM in fig. 22 shows similarly the position of the mean in
an ideal distribution. In a symmetrical distribution the mean
coincides with the centre of symmetry. The student should mark
the position of the mean in the diagram of every frequency-distribution
that he draws, and so accustom himself to thinking of
the mean, not as an abstraction, but always in relation to the
frequency-distribution of the variable concerned.

13. The following examples give important properties of the
arithmetic mean, and at the same time illustrate the facility of its
algebraic treatment:—

(a) The sum of the deviations from the mean, taken with their
proper signs, is zero.

This follows at once from equation (4): for if M and A are
identical, evidently $\Sigma(f.\xi)$ must be zero.
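
Property (a) may be verified on any short series. In the following Python
sketch the five observations and the alternative origin A = 8 are
hypothetical, and the helper name deviations_from is our own.

```python
from fractions import Fraction


def deviations_from(values, origin):
    """Deviations of the values from a chosen origin, with their signs."""
    return [v - origin for v in values]


values = [Fraction(v) for v in (3, 7, 8, 12, 15)]   # hypothetical observations
N = len(values)
M = sum(values) / N

from_mean = deviations_from(values, M)
from_other = deviations_from(values, 8)   # an arbitrary origin A = 8

print("sum of deviations from the mean:", sum(from_mean))
print("sum of deviations from A = 8   :", sum(from_other))
print("equals N(M - A):", sum(from_other) == N * (M - 8))
```

The sum of the deviations vanishes only when the origin is the mean
itself; from any other origin A it is N(M - A), as equation (4) requires.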





(b) If a series of N observations of a variable X consist of, say,
two component series, the mean of the whole series can be
readily expressed in terms of the means of the two components.
For if we denote the values in the first series by $X_1$ and in the
second series by $X_2$,

$$\Sigma(X) = \Sigma(X_1) + \Sigma(X_2),$$

that is, if there be $N_1$ observations in the first series and $N_2$ in
the second, and the means of the two series be $M_1$, $M_2$ respectively,

$$N.M = N_1.M_1 + N_2.M_2 \qquad (5)$$

For example, we find from the data of Table VI., Chap. VI.,

    Mean stature of the 346 men born in Ireland = 67.78 in.
    Mean stature of the 741 men born in Wales   = 66.62 in.

Hence the mean stature of the 1087 men born in the two countries
is given by the equation —

$$1087\,M = (346 \times 67.78) + (741 \times 66.62).$$

That is, M = 66.99 inches. It is evident that the form of the
relation (5) is quite general: if there are r series of observations
$X_1, X_2, \dots, X_r$, the mean M of the whole series is related to
the means $M_1, M_2, \dots, M_r$ of the component series by the
equation

$$N.M = N_1.M_1 + N_2.M_2 + \dots + N_r.M_r \qquad (6)$$

For the convenient checking of arithmetic, it is useful to note
that, if the same arbitrary origin A for the deviations $\xi$ be taken
in each case, we must have, denoting the component series by the
subscripts 1, 2, . . . r as before,

$$\Sigma(f.\xi) = \Sigma(f_1.\xi_1) + \Sigma(f_2.\xi_2) + \dots + \Sigma(f_r.\xi_r) \qquad (7)$$

The agreement of these totals accordingly checks the work.
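
Relation (6) may be sketched as a small Python function and applied to
the figures for Ireland and Wales quoted in the text. The function name
combined_mean is our own.

```python
from fractions import Fraction


def combined_mean(components):
    """Equation (6): N.M = N1.M1 + N2.M2 + ... + Nr.Mr.

    `components` is a list of (number of observations, mean) pairs.
    """
    total_n = sum(n for n, _ in components)
    return sum(n * m for n, m in components) / total_n


ireland = (346, Fraction("67.78"))   # men born in Ireland
wales = (741, Fraction("66.62"))     # men born in Wales
M = combined_mean([ireland, wales])

print("mean stature of the 1087 men:", round(float(M), 2), "inches")
```

The combined mean is the mean of the component means weighted by the
numbers of observations, and so lies nearer to the mean of the larger
series.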

As an important corollary to the general relation (6), it may
be noted that the approximate value for the mean obtained from
any frequency distribution is the same whether we assume (1)
that all the values in any class are identical with the mid-value
of the class-interval, or (2) that the mean of the values in the
class is identical with the mid-value of the class-interval.

(c) The mean of all the sums or differences of corresponding
observations in two series (of equal numbers of observations) is
equal to the sum or difference of the means of the two series.

This follows almost at once. For if

$$X = X_1 \pm X_2,$$

$$\Sigma(X) = \Sigma(X_1) \pm \Sigma(X_2).$$





That is, if $M$, $M_1$, $M_2$ be the respective means,

$$M = M_1 \pm M_2 \qquad (8)$$

Evidently the form of this result is again quite general, so that
if

$$X = X_1 \pm X_2 \pm \dots \pm X_n,$$

$$M = M_1 \pm M_2 \pm \dots \pm M_n \qquad (9)$$

As a useful illustration of equation (8), consider the case of
measurements of any kind that are subject (as indeed all
measures must be) to greater or less errors. The actual
measurement $X$ in any such case is the algebraic sum of the true
measurement $X_1$ and an error $X_2$. The mean of the actual
measurements $M$ is therefore the sum of the true mean $M_1$ and
the arithmetic mean of the errors $M_2$. If, and only if, the
latter be zero, will the observed mean be identical with the true
mean. Errors of grouping (§ 11) are a case in point.
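
The illustration of equation (8) may be sketched in Python. The true
measurements and the two sets of errors are hypothetical, chosen so that
one set of errors has a mean of zero and the other has not.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean as an exact fraction."""
    return Fraction(sum(values), len(values))


true_values = [170, 165, 180, 175, 160]   # hypothetical true measurements X1
errors_zero_mean = [1, -2, 0, 2, -1]      # hypothetical errors X2, mean zero
errors_biased = [1, 2, 0, 2, 0]           # hypothetical errors X2, mean 1


def observed(errors):
    """Actual measurements X = X1 + X2."""
    return [t + e for t, e in zip(true_values, errors)]


M1 = mean(true_values)
for errors in (errors_zero_mean, errors_biased):
    M = mean(observed(errors))
    M2 = mean(errors)
    print("observed mean", M, "= true mean", M1, "+ mean error", M2)
```

In the first case the observed mean coincides with the true mean although
no single measurement need be exact; in the second it is displaced by the
mean error.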

14. The median. — The median may be defined as the middlemost
or central value of the variable when the values are ranged
in order of magnitude, or as the value such that greater and
smaller values occur with equal frequency. In the case of a
frequency-curve, the median may be defined as that value of the
variable the vertical through which divides the area of the curve
into two equal parts, as the vertical through Mi in fig. 22.

The median, like the mean, fulfils the conditions (b) and (c)
of § 4, seeing that it is based on all the observations made, and
that it possesses the simple property of being the central or
middlemost value, so that its nature is obvious. But the definition
does not necessarily lead in all cases to a determinate value.
If there be an odd number of different values of X observed, say
2n+1, the (n+1)th in order of magnitude is the only value
fulfilling the definition. But if there be an even number, say
2n different values, any value between the nth and (n+1)th
fulfils the conditions. In such a case it appears to be usual to
take the mean of the nth and (n+1)th values as the median,
but this is a convention supplementary to the definition. It
should also be noted that in the case of a discontinuous variable
the second form of the definition in general breaks down: if we
range the values in order there is always a middlemost value
(provided the number of observations be odd), but there is not, as a
rule, any value such that greater and less values occur with equal
frequency. Thus in Table III., § 3 of Chap. VI., we see that 45 per
cent. of the poppy capsules had 12 or fewer stigmatic rays, 55
per cent. had 13 or more; similarly 61 per cent. had 13 or fewer
rays, 39 per cent. had 14 or more. There is no number of rays
rays, 39 per cent had 14 or more. There is no number of rays 





such that the frequencies in excess and defect are equal.
In the case of the buttercups of Table XIV. (Chap. VI. § 15)
there is no number of petals that even remotely fulfils the
required condition. An analogous difficulty may arise, it may
be remarked, even in the case of an odd number of observations
of a continuous variable if the number of observations be small
and several of the observed values identical. The median is
therefore a form of average of most uncertain meaning in cases
of strictly discontinuous variation, for it may be exceeded by
5, 10, 15, or 20 per cent. only of the observed values, instead of
by 50 per cent.: its use in such cases is to be deprecated, and
is perhaps best avoided in any case, whether the variation be
continuous or discontinuous, in which small series of observations
have to be dealt with.
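
These points may be illustrated with the statistics module of the Python
standard library. All the figures are hypothetical; in particular the
counts of petals are invented for the purpose and are not those of
Table XIV.

```python
import statistics

# An odd number of different values: the middlemost is the only median.
odd = [4, 9, 2, 7, 5]
print("odd series :", statistics.median(odd))

# An even number: any value between the two central ones fulfils the
# definition; the mean of the two is taken only by convention.
even = [4, 9, 2, 7, 5, 11]
print("even series:", statistics.median_low(even),
      statistics.median_high(even), statistics.median(even))

# A strictly discontinuous variable: hypothetical petal counts of 100 flowers.
petals = [5] * 80 + [6] * 12 + [7] * 5 + [8] * 3
median_petals = statistics.median(petals)
share_above = sum(p > median_petals for p in petals) / len(petals)
print("median number of petals:", median_petals)
print("proportion of flowers exceeding it:", share_above)
```

In the last case the median is exceeded by only one-fifth of the
observations, so that the second form of the definition fails entirely.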

15. When a table showing the frequency-distribution for a
long series of observations of a continuous variable is given, no
difficulty arises, as a sufficiently approximate value of the median
can be readily determined by simple interpolation on the hypothesis
that the values in each class are uniformly distributed
throughout the interval. Thus, taking the figures in our first
illustration of the method of calculating the mean, the total
number of observations (registration districts) is 632, of which
the half is 316. Looking down the table, we see that there are
227 districts with not more than 2.75 per cent. of the population
in receipt of relief, and 100 more with between 2.75 and 3.25
per cent. But only 89 are required to make up the total of 316;
hence the value of the median is taken as

$$2.75 + \frac{89}{100} \times \frac{1}{2} = 2.75 + 0.445 = 3.195 \text{ per cent.}$$

The mean being 3.29, the median is slightly less; its position
is indicated by Mi in fig. 21.

The value of the median stature of males may be similarly
calculated from the data of the second illustration. The work
may be indicated thus:—

    Half the total number of observations (8585)  = 4292.5
    Total frequency under 66 15/16 inches         = 3589
    Difference                                    =  703.5
    Frequency in next interval                    = 1329

$$\text{Therefore median} = 66\tfrac{15}{16} + \frac{703.5}{1329} = 67.47 \text{ inches.}$$





The difference between median and mean in this case is
therefore only about one-hundredth of an inch, the smallness
of the difference arising from the approximate symmetry of
the distribution. In an absolutely symmetrical distribution
it is evident that mean and median must coincide.
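
The interpolation for the median may be sketched as a short Python
function and applied to both illustrations. The function name
grouped_median is our own; the lower limits of the first classes
(0.75 per cent. and 56 15/16 inches) follow from the class-limits used in
the text.

```python
from fractions import Fraction


def grouped_median(lower_limit, interval, frequencies):
    """Median by simple interpolation, on the hypothesis that the values
    in each class are uniformly distributed throughout the interval."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    half = Fraction(n, 2)
    below = 0                       # total frequency under the current class
    for k, f in enumerate(frequencies):
        if f and below + f >= half:
            return lower_limit + interval * (k + (half - below) / f)
        below += f


# Example i.: classes 0.75-1.25, 1.25-1.75, ... per cent.
relief = [18, 48, 72, 89, 100, 90, 75, 60, 40, 21, 11, 5, 1, 1, 0, 1]
median_relief = grouped_median(Fraction(3, 4), Fraction(1, 2), relief)

# Example ii.: classes of 1 inch beginning at 56 15/16 inches.
stature = [2, 4, 14, 41, 83, 169, 394, 669, 990, 1223, 1329,
           1230, 1063, 646, 392, 202, 79, 32, 16, 5, 2]
median_stature = grouped_median(Fraction(56 * 16 + 15, 16), 1, stature)

print("median pauperism:", float(median_relief), "per cent.")
print("median stature  :", round(float(median_stature), 2), "inches")
```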

16. Graphical interpolation may, if desired, be substituted
for arithmetical interpolation. Taking, again, the figures of
Example i., the number of districts with pauperism not exceeding
2.25 is 138; not exceeding 2.75, 227; not exceeding 3.25, 327;
and not exceeding 3.75, 417. Plot the numbers of districts
with pauperism not exceeding each value X to the corresponding



Fig. 23. — Determination of the median by graphical interpolation.
(Horizontal axis: percentage of the population in receipt of relief.)

value of X on squared paper, to a good large scale, as in fig. 23,
and draw a smooth curve through the points thus obtained,
preferably with the aid of one of the “curves,” splines, or flexible
curves sold by instrument-makers for the purpose. The point
in which the smooth curve so obtained cuts the horizontal line
corresponding to a total frequency N/2 = 316 gives the median.
In general the curve is so flat that the value obtained by this
graphical method does not differ appreciably from that calculated
arithmetically (the arithmetical process assuming that the
curve is a straight line between the points on either side of
the median); if the curvature is considerable, the graphical
value, assuming, of course, careful and accurate draughtsmanship,
is to be preferred to the arithmetical value, as it does not





involve the crude assumption that the frequency is uniformly
distributed over the interval in which the median lies.

17. A comparison of the calculations for the mean and
for the median respectively will show that on the score of
brevity of calculation the median has a distinct advantage.
When, however, the ease of algebraical treatment of the two
forms of average is compared, the superiority lies wholly on
the side of the mean. As was shown in § 13, when several series
of observations are combined into a single series, the mean of
the resultant distribution can be simply expressed in terms
of the means of the components. The expression of the
median of the resultant distribution in terms of the medians
of the components is, however, not merely complex and difficult,
but impossible: the value of the resultant median depends on
the forms of the component distributions, and not on their
medians alone. If two symmetrical distributions of the same
form and with the same numbers of observations, but with
different medians, be combined, the resultant median must
evidently (from symmetry) coincide with the resultant mean, i.e.
lie halfway between the means of the components. But if the
two components be asymmetrical, or (whatever their form)
if the degrees of dispersion or numbers of observations in the
two series be different, the resultant median will not coincide
with the resultant mean, nor with any other simply assignable
value. It is impossible, therefore, to give any theorem for
medians analogous to equations (5) and (6) for means. It is
equally impossible to give any theorem analogous to equations
(8) and (9) of § 13. The median of the sum or difference of
pairs of corresponding observations in two series is not,
in general, equal to the sum or difference of the medians of
the two series; the median value of a measurement subject to
error is not necessarily identical with the true median, even
if the median error be zero, i.e. if positive and negative errors
be equally frequent.
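
The impossibility of a theorem for medians analogous to equation (5) may
be seen from a small hypothetical case, sketched here in Python. Two
alternative first series have the same number of observations, the same
mean and the same median, differing only in dispersion; each is combined
with the same second series.

```python
import statistics

first_a = [1, 5, 9]      # hypothetical series: mean 5, median 5, wide dispersion
first_b = [4, 5, 6]      # hypothetical series: mean 5, median 5, narrow dispersion
second = [10, 11, 12]    # hypothetical series: mean 11, median 11

for first in (first_a, first_b):
    combined = first + second
    print("component medians:", statistics.median(first), statistics.median(second),
          "| combined mean:", statistics.mean(combined),
          "| combined median:", statistics.median(combined))
```

The combined mean is the same in both cases, as equation (5) requires,
but the combined median is not, although the numbers of observations and
the medians of the components are identical.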

18. These limitations render the applications of the median in
any work in which theoretical considerations are necessary
comparatively circumscribed. On the other hand, the median may
have an advantage over the mean for special reasons. (a) It is
very readily calculated; a factor to which, however, as already
stated, too much weight ought not to be attached. (b) It is
readily obtained, without the necessity of measuring all the
objects to be observed, in any case in which they can be arranged
by eye in order of magnitude. If, for instance, a number of men
be ranked in order of stature, the stature of the middlemost is
the median, and he alone need be measured. (On the other hand


120 


THEORY OR STATiSXICS. 


it is useless in the cases cited at the end of § 6: the median wage
cannot be found from the total of the wages-bill, and the total
of the wages-bill is not known when the median is given.) (c) It
is sometimes useful as a makeshift, when the observations are so
given that the calculation of the mean is impossible, owing, e.g., to
a final indefinite class, as in Table IV. (Chap. VI. § 10). (d) The
median may sometimes be preferable to the mean, owing to its
being less affected by abnormally large or small values of the
variable. The stature of a giant would have no more influence
on the median stature of a number of men than the stature of
any other man whose height is only just greater than the median.
If a number of men enjoy incomes closely clustering round a
median of £500 a year, the median will be no more affected by
the addition to the group of a man with the income of £50,000
than by the addition of a man with an income of £5000, or even
£600. If observations of any kind are liable to present occasional
greatly outlying values of this sort (whether real, or due to
errors or blunders), the median will be more stable and less
affected by fluctuations of sampling than the arithmetic mean.
(In general the mean is the less affected.) The point is discussed
more fully later (Chap. XVII.). (e) It may be added that the
median is, in a certain sense, a particularly real and natural
form of average, for the object or individual that is the median
object or individual on any one system of measuring the character
with which we are concerned will remain the median on any
other method of measurement which leaves the objects in the
same relative order. Thus a batch of eggs representing eggs
of the median price, when prices are reckoned at so much per
dozen, will remain a batch representing the median price when
prices are reckoned at so many eggs to the shilling.
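Points (d) and (e) lend themselves to a short numerical check. The sketch below, in Python, uses invented incomes and egg prices (all figures and the helper `median_item` are assumptions for illustration): the median of the income group is the same whichever large income is added, and the batch of eggs holding the middle rank is the same batch under either way of quoting prices.

```python
from statistics import mean, median

# (d) Hypothetical incomes in pounds a year, clustering round 500.
incomes = [480, 490, 500, 510, 520]
medians_after = []
for newcomer in (600, 5000, 50000):
    group = incomes + [newcomer]
    medians_after.append(median(group))
    # The median is unchanged by the size of the newcomer; the mean is not.
    print(newcomer, median(group), round(mean(group), 1))

def median_item(values):
    """Position (in the original list) of the middle item of an odd-sized series."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    return order[len(values) // 2]

# (e) Hypothetical prices of five batches of eggs, in pence per dozen.
pence_per_dozen = [9, 15, 12, 10, 18]
# The same prices restated as eggs to the shilling (12 pence buys 144/p eggs).
eggs_per_shilling = [144 / p for p in pence_per_dozen]

print(median_item(pence_per_dozen), median_item(eggs_per_shilling))
```

With an odd number of batches the median batch is a definite object, so the comparison in (e) reduces to comparing two positions in the list.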

19. The Mode. The mode is the value of the variable corresponding
to the maximum of the ideal frequency-curve which
gives the closest possible fit to the actual distribution.

It is evident that in an ideal symmetrical distribution mean,
median and mode coincide with the centre of symmetry. If,
however, the distribution be asymmetrical, as in fig. 22, the three
forms of average are distinct, Mo being the mode, Mi the median,
and M the mean. Clearly, the mode is an important form of
average in the cases of skew distributions, though the term is of
recent introduction (Pearson, ref. 11). It represents the value
which is most frequent or typical, the value which is in fact the
fashion (la mode). But a difficulty at once arises on attempting
to determine this value for such distributions as occur in practice.
It is no use giving merely the mid-value of the class-interval into
which the greatest frequency falls, for this is entirely dependent
on the choice of the scale of class-intervals. It is no use making
the class-intervals very small to avoid error on that account, for
the class-frequencies will then become small and the distribution
irregular. What we want to arrive at is the mid-value of the
interval for which the frequency would be a maximum, if the
intervals could be made indefinitely small and at the same time
the number of observations be so increased that the class-frequencies
should run smoothly. As the observations cannot, in a
practical case, be indefinitely increased, it is evident that some
process of smoothing out the irregularities that occur in the
actual distribution must be adopted, in order to ascertain the
approximate value of the mode. But there is only one smoothing
process that is really satisfactory, in so far as every observation
can be taken into account in the determination, and that is the
method of fitting an ideal frequency-curve of given equation to
the actual figures. The value of the variable corresponding to the
maximum of the fitted curve is then taken as the mode, in
accordance with our definition. Mo in fig. 21 is the value of the
mode so determined for the distribution of pauperism, the value
2.99 being, as it happens, very nearly coincident with the centre
of the interval in which the greatest frequency lies. The
determination of the mode by this method, the only strictly
satisfactory one, must, however, be left to the more advanced student.
20. At the same time there is an approximate relation between
mean, median, and mode that appears to hold good with surprising
closeness for moderately asymmetrical distributions, approaching
the ideal type of fig. 9, and it is one that should be borne in
mind as giving, roughly at all events, the relative values of
these three averages for a great many cases with which the
student will have to deal. It is expressed by the equation

     Mode = Mean - 3 (Mean - Median)

That is to say, the median lies one-third of the distance from the
mean towards the mode (compare figs. 21 and 22). For the
distribution of pauperism we have, taking the mean to three places of
decimals,

     Mean . . . . . .   3.289
     Median . . . . .   3.195
     Difference . . .   0.094

     Hence approximate mode = 3.289 - 3 × 0.094
                            = 3.007,

or 3.01 to the second place of decimals, which is sufficient accuracy
for the final result, though three decimal places must be retained
for the calculation. The true mode, found by fitting an ideal
distribution, is 2.99. As further illustrations of the closeness
with which the relation may be expected to hold in different cases,
we give below the results for the distributions of pauperism in
the unions of England and Wales in the years 1850, 1860, 1870,
1881, and 1891 (the last being the illustration taken above),
and also the results for the distribution of barometer heights at
Southampton (Table XI., Chap. VI. § 14), and similar
distributions at four other stations.

Comparison of the Approximate and True Modes in the Case of Five
Distributions of Pauperism (Percentages of the Population in receipt of
Relief) in the Unions of England and Wales. (Yule, Jour. Roy. Stat.
Soc., vol. lix., 1896.)

Year.     Mean.     Median.     Approximate Mode.     True Mode.
1850      6.508     6.261       5.767                 5.815
1860      5.195     5.000       4.610                 4.657
1870      5.451     5.380       5.238                 5.038
1881      3.676     3.523       3.217                 3.240
1891      3.289     3.195       3.007                 2.987


Comparison of the Approximate and True Modes in the Case of Five
Distributions of the Height of the Barometer for Daily Observations at the
Stations named. (Distributions given by Karl Pearson and Alice Lee,
Phil. Trans., A, vol. cxc. (1897), p. 423.)

Station.          Mean.      Median.     Approximate Mode.     True Mode.
Southampton       29.981     30.000      30.038                30.039
Londonderry       29.891     29.915      29.963                29.960
Carmarthen        29.952     29.974      30.018                30.013
Glasgow           29.886     29.906      29.946                29.967
Dundee            29.870     29.890      29.930                29.951


It will be seen that in the case of the pauperism figures the
approximate mode only diverges markedly from the true value
in the year 1870, a year in which the frequency-distribution was
very irregular. In all the other years the difference between the
true and approximate values of the mode is hardly greater than
the alteration that might be caused in the true mode itself by
slight variations in the method of fitting the curve to the actual
distribution. Similar remarks apply to the second series of
illustrations: the true and approximate values are extremely close,
except in the case of Dundee and Glasgow, where the divergence
reaches two-hundredths of an inch.
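The rule of § 20 is easily put into a few lines of code. The following Python sketch (the function name `approximate_mode` is an assumption, and only four rows of the two tables are entered) recomputes the approximate mode from the tabulated mean and median and sets it beside the true mode quoted in the tables.

```python
def approximate_mode(mean, median):
    """Approximate mode from the relation Mode = Mean - 3 (Mean - Median)."""
    return mean - 3 * (mean - median)

# (mean, median, true mode) as tabulated above for four of the distributions.
rows = {
    "Pauperism 1850": (6.508, 6.261, 5.815),
    "Pauperism 1891": (3.289, 3.195, 2.987),
    "Southampton":    (29.981, 30.000, 30.039),
    "Glasgow":        (29.886, 29.906, 29.967),
}

for name, (mean, median, true_mode) in rows.items():
    approx = approximate_mode(mean, median)
    # Divergence of the approximate from the true (fitted) mode.
    print(f"{name:15s} approx {approx:7.3f}  true {true_mode:7.3f}  "
          f"difference {approx - true_mode:+.3f}")
```

For Glasgow the difference comes out at about two-hundredths of an inch, in agreement with the remark in the text.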

21. Summing up the preceding paragraphs, we may say that
the mean is the form of average to use for all general purposes;
it is simply calculated, its value is always determinate, its
algebraic treatment is particularly easy, and in most cases it is
rather less affected than the median by errors of sampling. The
median is, it is true, somewhat more easily calculated from a given
frequency-distribution than is the mean; it is sometimes a useful
makeshift, and in a certain class of cases it is more and not less
stable than the mean; but its use is undesirable in cases of
discontinuous variation, its value may be indeterminate, and its algebraic
treatment is difficult and often impossible. The mode, finally,
is a form of average hardly suitable for elementary use, owing
to the difficulty of its determination, but at the same time it
represents an important value of the variable. The arithmetic
mean should invariably be employed unless there is some very
definite reason for the choice of another form of average, and the
elementary student will do very well if he limits himself to its
use. Objection is sometimes taken to the use of the mean in the
case of asymmetrical frequency-distributions, on the ground that
the mean is not the mode, and that its value is consequently
misleading. But no one in the least degree familiar with the
manifold forms taken by frequency-distributions would regard the
two as in general identical; and while the importance of the mode
is a good reason for stating its value in addition to that of the
mean, it cannot replace the latter. The objection, it may be noted,
would apply with almost equal force to the median, for, as we have
seen (§ 20), the difference between mode and median is usually
about two-thirds of the difference between mode and mean.
22. The Geometric Mean. The geometric mean $G$ of a series of
values $X_1, X_2, X_3, \ldots, X_N$ is defined by the relation

\[ G = (X_1 X_2 X_3 \cdots X_N)^{1/N} \tag{10} \]

The definition may also be expressed in terms of logarithms,

\[ \log G = \frac{1}{N}\,\Sigma(\log X) \tag{11} \]

that is to say, the logarithm of the geometric mean of a series of
values is the arithmetic mean of their logarithms.

The geometric mean of a given series of positive quantities is always
less than their arithmetic mean, unless all the quantities are equal,
when the two coincide; the student will find a proof in
most text-books of algebra, and in ref. 10. The magnitude of
the difference depends largely on the amount of dispersion of the
variable in proportion to the magnitude of the mean (cf. Chap.
VIII., Question 8). The geometric mean is necessarily zero, it should
be noticed, if even a single value of X is zero, and it may become
imaginary if negative values occur. Excluding these cases, the value of the
geometric mean is always determinate and is rigidly defined. The
computation is a little long, owing to the necessity of taking
logarithms: it is hardly necessary to give an example, as the
method is simply that of finding the arithmetic mean of the
logarithms of X (instead of the values of X) in accordance with
equation (11). If there are many observations, a table should be
drawn up giving the frequency-distribution of log X, and the
mean should be calculated as in Examples i. and ii. of §§ 9 and 10.
The geometric mean has never come into general use as a
representative average, partly, no doubt, on account of its rather
troublesome computation, but principally on account of its somewhat
abstract mathematical character (cf. § 4 (c)): the geometric
mean does not possess any simple and obvious properties which
render its general nature readily comprehensible.
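The logarithmic method of computation may be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `geometric_mean`, its optional `frequencies` argument, and the sample values are all invented for the purpose, and common logarithms are used simply because they are the ones a computer with tables would take.

```python
import math

def geometric_mean(values, frequencies=None):
    """Geometric mean found as the antilogarithm of the mean of log X.

    `frequencies` (optional) gives the number of observations at each value,
    as in a frequency-distribution of log X.
    """
    if frequencies is None:
        frequencies = [1] * len(values)
    if any(v <= 0 for v in values):
        raise ValueError("zero or negative values are excluded")
    n = sum(frequencies)
    mean_log = sum(f * math.log10(v) for v, f in zip(values, frequencies)) / n
    return 10 ** mean_log

values = [2, 4, 8]                      # hypothetical observations
print(geometric_mean(values))           # very nearly 4
print(sum(values) / len(values))        # arithmetic mean, which is greater

# The same method applied to a small hypothetical frequency table.
print(geometric_mean([1, 10, 100], frequencies=[1, 2, 1]))   # very nearly 10
```

Zero and negative values are rejected outright, in keeping with the exclusion made in the text.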

23. At the same time, as the following examples show, the geometric
mean possesses some important properties, and is readily treated
algebraically in certain cases.

(a) If the series of observations X consist of r component
series, there being $N_1$ observations in the first, $N_2$ in the second,
and so on, the geometric mean $G$ of the whole series can be
readily expressed in terms of the geometric means $G_1$, $G_2$, etc., of
the component series. For evidently we have at once (as in § 13
(b)),

\[ N \log G = N_1 \log G_1 + N_2 \log G_2 + \ldots + N_r \log G_r \tag{12} \]

(b) The geometric mean of the ratios of corresponding observations
in two series is equal to the ratio of their geometric means.
For if

\[ X = X_1 / X_2, \]
\[ \log X = \log X_1 - \log X_2, \]

then summing for all pairs of $X_1$'s and $X_2$'s,

\[ G = G_1 / G_2 \tag{13} \]

(c) Similarly, if a variable X is given as the product of any
number of others, i.e. if

\[ X = X_1 X_2 X_3 \cdots X_r, \]

$X_1, X_2, \ldots, X_r$ denoting corresponding observations in r
different series, the geometric mean $G$ of X is expressed in terms
of the geometric means $G_1, G_2, \ldots, G_r$ of $X_1, X_2, \ldots, X_r$ by
the relation

\[ G = G_1 G_2 G_3 \cdots G_r \tag{14} \]

That is to say, the geometric mean of the product is the product
of the geometric means.
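The three properties (a), (b) and (c) can be verified on any set of positive figures. The sketch below is in Python and uses invented series throughout (the values and the helper name `geometric_mean` are assumptions for illustration); each comparison is made with a small tolerance, since logarithms are computed only approximately.

```python
import math

def geometric_mean(values):
    """Antilogarithm of the arithmetic mean of the logarithms (positive values)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# (a) Two hypothetical component series of unequal numbers.
first = [2.0, 8.0, 4.0]
second = [1.0, 4.0, 16.0, 2.0]
whole = first + second
lhs = len(whole) * math.log(geometric_mean(whole))
rhs = (len(first) * math.log(geometric_mean(first))
       + len(second) * math.log(geometric_mean(second)))
print(math.isclose(lhs, rhs))

# (b) and (c) Corresponding observations in two hypothetical series.
x1 = [3.0, 6.0, 12.0]
x2 = [1.5, 2.0, 8.0]
ratios = [a / b for a, b in zip(x1, x2)]
products = [a * b for a, b in zip(x1, x2)]

# Geometric mean of the ratios against the ratio of the geometric means.
print(math.isclose(geometric_mean(ratios),
                   geometric_mean(x1) / geometric_mean(x2)))
# Geometric mean of the products against the product of the geometric means.
print(math.isclose(geometric_mean(products),
                   geometric_mean(x1) * geometric_mean(x2)))
```

All three comparisons print `True`; replacing `geometric_mean` by the arithmetic mean makes the second and third fail for these figures.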





[Fig. 24. Showing the Populations of certain rural counties of England
for each Census year from 1801 to 1901. Horizontal axis: census year;
vertical axis: population (000's).]

24. The use of the geometric mean finds its simplest application
in estimating the numbers of a population midway between two
epochs (say two census years) at which the population is known.
If nothing is known concerning the increase of the population
save that the numbers recorded at the first census were $P_0$ and at
the second census, $n$ years later, $P_n$, the most reasonable assumption
to make is that the percentage increase in each year has
been the same, so that the populations in successive years form a
geometric series, $P_0 r$ being the population a year after the first
census, $P_0 r^2$ two years after the first census, and so on, and

\[ P_n = P_0 r^n \tag{15} \]

The population midway between the two censuses is therefore

\[ P_{n/2} = P_0 r^{n/2} = (P_0 P_n)^{1/2} \tag{16} \]

i.e. the geometric mean of the numbers given by the two censuses.
This result must, however, be used with discretion. The rate of
increase of population is not necessarily, or even usually, constant
over any considerable period of time: if it were so, a curve
representing the growth of population as in fig. 24 would be
continuously convex to the base, whether the population were
increasing or decreasing. In the diagram it will be seen that
the curves are frequently concave towards the base, and similar
results will often be found for districts in which the population is
not increasing very rapidly, and from which there is much
emigration. Further, the assumption is not self-consistent in any
case in which the rate of increase is not uniform over the entire
area, and almost any area can be analysed into parts which are not
similar in this respect. For if in one part of the area considered
the initial population is $P_0$ and the common ratio $R$, and in the
remainder of the area the initial population is $p_0$ and the common
ratio $r$, the population in year $n$ is given by

\[ P_n + p_n = P_0 R^n + p_0 r^n. \]

This does not represent a constant rate of increase unless $R = r$.
If then, for example, a constant percentage rate of increase be
assumed for England and Wales as a whole, it cannot be assumed
for the Counties: if it be assumed for the Counties, it cannot be
assumed for the country as a whole. The student is referred to
refs. 14, 15 for a discussion of methods that may be used for the
consistent estimation of populations under such circumstances.
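The want of self-consistency can be seen in a small numerical case. The following Python sketch uses invented populations for an area divided into two parts (the figures and the function name `midway_population` are assumptions for illustration only): the estimate for the whole area differs from the sum of the estimates for the parts as soon as the two rates of increase differ.

```python
import math

def midway_population(first_census, second_census):
    """Population midway between two censuses, assuming a constant
    percentage rate of increase: the geometric mean of the two counts."""
    return math.sqrt(first_census * second_census)

# Hypothetical area in two parts: one growing, one stationary.
growing = (100_000, 144_000)
stationary = (100_000, 100_000)

by_parts = midway_population(*growing) + midway_population(*stationary)
as_whole = midway_population(growing[0] + stationary[0],
                             growing[1] + stationary[1])

print(round(by_parts))   # estimate built up from the two parts
print(round(as_whole))   # estimate for the area taken as a whole
```

Here the estimate for the whole exceeds that built up from the parts by some nine hundred persons, although both rest on the same census figures.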

25. The property of the geometric mean illustrated by equation
(13) renders it, in some respects, a peculiarly convenient form of
average in dealing with ratios, i.e. "index-numbers," as they are
termed, of prices. Let

\[
\begin{array}{lllll}
X'_0, & X''_0, & X'''_0, & \ldots & X^{N}_0 \\
X'_1, & X''_1, & X'''_1, & \ldots & X^{N}_1 \\
X'_2, & X''_2, & X'''_2, & \ldots & X^{N}_2
\end{array}
\]

denote the prices of $N$ commodities in the years 0, 1, 2, . . .
Further, let $Y'_{10} = X'_1 / X'_0$, and so on, so that

\[
\begin{array}{llll}
Y'_{10}, & Y''_{10}, & Y'''_{10}, & \ldots \\
Y'_{20}, & Y''_{20}, & Y'''_{20}, & \ldots
\end{array}
\]

represent the ratios of the prices of the several commodities in years
1, 2, . . . to their prices in year 0. These ratios, in practice
multiplied by 100, are termed index-numbers of the prices of the
several commodities, on the year 0 as base. Evidently some
form of average of the Y's for any given year will afford an
indication of the general level of prices for that year, provided the
commodities chosen are sufficiently numerous and representative.
The question is, what form of average to choose. If the geometric
mean be chosen, and $G_{10}$, $G_{20}$ denote the geometric means of the
Y's for the years 1 and 2 respectively, we have

\[
\frac{G_{20}}{G_{10}}
= \left( \frac{Y'_{20}}{Y'_{10}} \cdot \frac{Y''_{20}}{Y''_{10}} \cdot \frac{Y'''_{20}}{Y'''_{10}} \cdots \right)^{1/N}
= \left( \frac{X'_2}{X'_1} \cdot \frac{X''_2}{X''_1} \cdot \frac{X'''_2}{X'''_1} \cdots \right)^{1/N}
= \left( Y'_{21} \cdot Y''_{21} \cdot Y'''_{21} \cdots \right)^{1/N}
\]

From the first form of this equation we see that the ratio of the
geometric mean index-number in year 2 to that in year 1 is
identical with the geometric mean of the ratios for the
index-numbers of the several commodities. A similar property does
not hold for any other form of average: the ratio of the arithmetic
mean index-numbers is not the same as the arithmetic mean of
the ratios, nor is the ratio of the medians the median of the
ratios. From the second and third forms of the equation it
appears further that the ratio of the geometric mean
index-number in year 2 to that in year 1 is independent of the prices in
the year first chosen as base (i.e. year 0), and is identical with the
geometric mean of the index-numbers for year 2 on year 1 as
base. Again, a similar property does not hold for any other form
of average. If arithmetic means of the index-numbers be taken,
for example, the ratio of the mean in year 2 to the mean in year
1 will vary with the year taken as base, and will differ more or
less from the arithmetic mean ratio of the prices in year 2 to the
prices of the same commodities in year 1; the same statement is
true if medians be used. The results given by the use of the
geometric mean possess, therefore, a certain consistency that is
not exhibited if other forms of average are employed. It was
used in a classical paper by Jevons (ref. 4), though not on quite
the same grounds, but has never been at all generally employed.
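The consistency claimed for the geometric mean can be exhibited on a small set of prices. The following Python sketch uses invented prices of three commodities in years 0, 1 and 2 (all figures and helper names are assumptions for illustration): the ratio of the geometric mean index-numbers agrees with the geometric mean of the price ratios of year 2 on year 1, whereas the corresponding arithmetic quantities disagree.

```python
import math

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def arithmetic_mean(values):
    return sum(values) / len(values)

# Hypothetical prices of three commodities in years 0, 1 and 2.
year0 = [10.0, 20.0, 40.0]
year1 = [12.0, 18.0, 50.0]
year2 = [15.0, 27.0, 40.0]

# Index-numbers (as simple ratios) on year 0 as base, and year 2 on year 1.
y10 = [b / a for a, b in zip(year0, year1)]
y20 = [b / a for a, b in zip(year0, year2)]
y21 = [b / a for a, b in zip(year1, year2)]

# Geometric means: the two routes give the same figure.
print(geometric_mean(y20) / geometric_mean(y10), geometric_mean(y21))

# Arithmetic means: the two routes give different figures.
print(arithmetic_mean(y20) / arithmetic_mean(y10), arithmetic_mean(y21))
```

Changing the base year alters the first arithmetic figure but leaves the geometric one untouched, since the base prices cancel in the product.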

26. The general use of the geometric mean has been suggested
on another ground, namely, that the magnitudes of deviations
appear, as a rule, to be dependent in some degree on the
magnitude of the average; thus the length of a mouse varies less than
the stature of a man, and the height of a shrub less than that of
a tree. Hence, it is argued, variations in such cases should be
measured rather by their ratio to, than their difference from, the
average; and if this is done, the geometric mean is the natural
average to use. If deviations be measured in this way, a
deviation $G/r$ will be regarded as the equivalent of a deviation $r.G$,
instead of a deviation $-x$ as the equivalent of a deviation $+x$.
If a distribution take the simplest possible form when relative
deviations are regarded as equivalents, the frequency of deviations
between $G/s$ and $G/r$ will be equal to the frequency of deviations
between $r.G$ and $s.G$. The frequency-curve will then be
symmetrical round log G if plotted to log X as base, and if there be
a single mode, log G will be that mode: a logarithmic or geometric
mode, as it might be termed. G will not be the mode if the
distribution be plotted in the ordinary way to values of X as base.
The theory of such a distribution has been discussed by more than
one author (refs. 2, 8, 9). The general applicability of the
assumption made does not, however, appear to have been very widely
tested, and the reasons assigned have not sufficed to bring the
geometric mean into common use. It may be noted that, as the
geometric mean is always less than the arithmetic mean, the
fundamental assumption which would justify the use of the former
clearly does not hold where the (arithmetic) mode is greater than
the arithmetic mean, as in Tables X. and XI. of the last chapter.

27. The Harmonic Mean. The harmonic mean of a series of
quantities is the reciprocal of the arithmetic mean of their
reciprocals, that is, if $H$ be the harmonic mean,

\[ \frac{1}{H} = \frac{1}{N}\,\Sigma\!\left(\frac{1}{X}\right) \]

The following illustration, the result of which is required for an
example in a later chapter (Chap. XIII. § 11), will serve to show
the method of calculation.

The table gives the number of litters of mice, in certain
breeding experiments, with given numbers (X) in the litter. (Data
from A. D. Darbishire, Biometrika, iii. pp. 30, 31.)


Number in       Number of
Litter.         Litters.
  X                f              f/X
  1                7             7.000
  2               11             5.500
  3               16             5.333
  4               17             4.250
  5               26             5.200
  6               31             5.167
  7               11             1.571
  8                1             0.125
  9                1             0.111
Total            121            34.257


Whence, 1/H = 0.2831, or H = 3.532. The arithmetic mean is 4.587,
or more than a unit greater.
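The calculation for the litters of mice may be repeated in a few lines. The sketch below is in Python; the litter sizes and frequencies are those of the table, while the function name `harmonic_mean` and its arguments are assumptions made for illustration.

```python
def harmonic_mean(values, frequencies):
    """Reciprocal of the arithmetic mean of the reciprocals, for grouped data."""
    n = sum(frequencies)
    sum_of_reciprocals = sum(f / x for x, f in zip(values, frequencies))
    return n / sum_of_reciprocals

# Number in litter (X) and number of litters (f), as in the table.
litter_size = [1, 2, 3, 4, 5, 6, 7, 8, 9]
litters = [7, 11, 16, 17, 26, 31, 11, 1, 1]

h = harmonic_mean(litter_size, litters)
arithmetic = sum(x * f for x, f in zip(litter_size, litters)) / sum(litters)

print(round(1 / h, 4))        # reciprocal of the harmonic mean
print(round(h, 3))            # harmonic mean
print(round(arithmetic, 3))   # arithmetic mean, for comparison
```

The three printed figures reproduce 0.2831, 3.532 and 4.587 respectively.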

If the prices of a commodity at different places or times are
stated in the form "so much for a unit of money," and an average
price obtained by taking the arithmetic mean of the quantities
sold for a unit of money, the result is equivalent to the harmonic
mean of prices stated in the ordinary way. Thus retail prices of
eggs were quoted before the War as "so many to the shilling."
Supposing we had 100 returns of retail prices of eggs, 50 returns
showing twelve eggs to the shilling, 30 fourteen to the shilling,
and 20 ten to the shilling; then the mean number per shilling
would be 12.2, equivalent to a price of 0.984d. per egg. But
if the prices had been quoted in the form usual for other
commodities, we should have had 50 returns showing a price of 1d.
per egg, 30 showing a price of 0.857d., and 20 a price of 1.2d.:
arithmetic mean 0.997d., a slightly greater value than the
harmonic mean of 0.984. The official returns of prices in India were,
until 1907, given in the form of "Sers (2.057 lbs.) per rupee."
The average annual price of a commodity was based on
half-monthly prices stated in this form, and "index-numbers" were
calculated from such annual averages. In the issues of "Prices
and Wages in India" for 1908 and later years the prices have
been stated in terms of "rupees per maund (82.286 lbs.)." The
change, it will be seen, amounts to a replacement of the harmonic
by the arithmetic mean price.

The harmonic mean of a series of quantities is always lower
than the geometric mean of the same quantities, and, a fortiori,
lower than the arithmetic mean, the amount of difference
depending largely on the magnitude of the dispersion relatively to the
magnitude of the mean. (Cf. Question 9, Chap. VIII.)
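The example of the eggs may be worked through as follows. This Python sketch takes the hundred returns stated in the text; the variable names are assumptions for illustration, and prices are reckoned in pence with twelve pence to the shilling.

```python
# Returns as stated in the text: eggs to the shilling, and number of returns.
eggs_per_shilling = [12, 14, 10]
returns = [50, 30, 20]
total = sum(returns)

# Arithmetic mean of the quantities sold for a shilling.
mean_eggs = sum(e * f for e, f in zip(eggs_per_shilling, returns)) / total
price_from_mean_eggs = 12 / mean_eggs           # pence per egg

# The same returns restated as prices in pence per egg.
prices = [12 / e for e in eggs_per_shilling]
arithmetic_price = sum(p * f for p, f in zip(prices, returns)) / total
harmonic_price = total / sum(f / p for p, f in zip(prices, returns))

print(round(mean_eggs, 1))              # eggs to the shilling
print(round(price_from_mean_eggs, 3))   # equivalent price per egg
print(round(harmonic_price, 3))         # harmonic mean of the prices
print(round(arithmetic_price, 3))       # arithmetic mean of the prices
```

The second and third figures agree, which is the equivalence asserted in the text; the fourth is slightly greater.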

REFERENCES.

General.

(1) Fechner, G. T., "Ueber den Ausgangswerth der kleinsten
    Abweichungssumme, dessen Bestimmung, Verwendung und
    Verallgemeinerung," Abh. d. kgl. sächsischen Gesellschaft d.
    Wissenschaften, vol. xviii. (also numbered xi. of the Abh. d.
    math.-phys. Classe); Leipzig (1878), p. 1. (The average defined as
    the origin from which the dispersion, measured in one way or another,
    is a minimum: geometric mean dealt with incidentally, pp. 13-16.)

(2) Fechner, G. T., Kollektivmasslehre, herausgegeben von G. F. Lipps;
    Engelmann, Leipzig, 1897. (Posthumously published: deals with
    frequency distributions, their forms, averages, and measures of
    dispersion in general; includes much of the matter of (1).)

(3) Zizek, Franz, Die statistischen Mittelwerthe; Duncker und Humblot,
    Leipzig, 1908. English translation, Statistical Averages, translated
    with additional notes, etc., by W. M. Persons; Holt & Co., New York,
    1913. (Non-mathematical, but useful to the economic student for
    references cited.)


The Geometric Mean.

(4) Jevons, W. Stanley, A Serious Fall in the Value of Gold ascertained
    and its Social Effects set forth; Stanford, London, 1863. Reprinted
    in Investigations in Currency and Finance; Macmillan, London, 1884.
    (The geometric mean applied to the measurement of price changes.)

(5) Jevons, W. Stanley, "On the Variation of Prices and the Value of
    the Currency since 1782," Jour. Roy. Stat. Soc., vol. xxviii., 1865.
    Also reprinted in volume cited above.

(6) Edgeworth, F. Y., "On the Method of ascertaining a Change in the
    Value of Gold," Jour. Roy. Stat. Soc., vol. xlvi., 1883, p. 714. (Some
    criticism of the reasons assigned by Jevons for the use of the
    geometric mean.)

(7) Galton, Francis, "The Geometric Mean in Vital and Social Statistics,"
    Proc. Roy. Soc., vol. xxix., 1879, p. 365.

(8) McAlister, Donald, "The Law of the Geometric Mean," ibid., p. 367.
    (The law of frequency to which the use of the geometric mean would
    be appropriate.)

(9) Kapteyn, J. C., Skew Frequency-curves in Biology and Statistics;
    Noordhoff, Groningen, and Wm. Dawson, London, 1903. (Contains,
    amongst other forms, a generalisation of McAlister's law.)

(10) Crawford, G. E., "An Elementary Proof that the Arithmetic Mean
    of any number of Positive Quantities is greater than the Geometric
    Mean," Proc. Edin. Math. Soc., vol. xviii., 1899-1900.

See also refs. 1 and 2.


The Mode.

(11) Pearson, Karl, "Skew Variation in Homogeneous Material," Phil.
    Trans. Roy. Soc., Series A, vol. clxxxvi., 1896, p. 343. (Definition of
    mode, p. 345.)

(12) Yule, G. U., "Notes on the History of Pauperism in England and
    Wales, etc.: Supplementary Note on the Determination of the Mode,"
    Jour. Roy. Stat. Soc., vol. lix., 1896, p. 343. (The note deals with
    elementary methods of approximately determining the mode: the
    one-third rule and one other.)

(13) Pearson, Karl, "On the Modal Value of an Organ or Character,"
    Biometrika, vol. i., 1902, p. 260. (A warning as to the inadequacy of
    mere inspection for determining the mode.)

Estimates of Population.

(14) Waters, A. C., "A Method for estimating Mean Populations in the
    last Intercensal Period," Jour. Roy. Stat. Soc., vol. lxiv., 1901, p. 293.

(15) Waters, A. C., Estimates of Population: Supplement to Annual Report of
    the Registrar-General for England and Wales (Cd. 2618, 1907, p. cxvii.).
    For the methods actually used, see the Reports of the Registrar-General
    of England and Wales for 1907, pp. cxxxii-cxxxiv, and for 1910,
    pp. xi-xii. Cf. Snow, ref. 11, Chap. XII., for a different method
    based on the symptoms of growth such as numbers of births or of houses.

Index-numbers.

These were incidentally referred to in § 25. The general theory of
index-numbers and the different methods in which they may be formed
are not considered in the present work. The student will find copious
references to the literature in the following:

(16) Edgeworth, F. Y., "Reports of the Committee appointed for the
    purpose of investigating the best methods of ascertaining and measuring
    Variations in the Value of the Monetary Standard," British Association
    Reports, 1887 (p. 247), 1888 (p. 181), 1889 (p. 133), and 1890 (p. 485).

(17) Edgeworth, F. Y., Article "Index-numbers" in Palgrave's Dictionary
    of Political Economy, vol. ii., Macmillan, 1896.

(18) Fountain, H., "Memorandum on the Construction of Index-numbers
    of Prices," in the Board of Trade Report on Wholesale and Retail
    Prices in the United Kingdom, 1903.

EXERCISES.

1. Verify the following means and medians from the data of Table VI.,
Chap. VI., p. 88.

Stature in Inches for Adult Males in:

                England.   Scotland.   Wales.   Ireland.
Mean . . . .     67.31      68.55      66.62     67.78
Median . . .     67.35      68.48      66.56     67.69

In the calculation of the means, use the same arbitrary origin as in Example
ii., and check your work by the method of § 13 (b).

2. Find the mean weight of adult males in the United Kingdom from the
data in the last column of Table IX., Chap. VI., p. 95. Also find the median
weight, and hence the approximate mode, by the method of § 20.

3. Similarly, find the mean, median, and approximate value of the mode
for the distribution of fecundity in race-horses, Table X., Chap. VI., p. 96.

4. Using a graphical method, find the median annual value of houses
assessed to inhabited house duty in the financial year 1885-6 from the data
of Table IV., Chap. VI., p. 83.

5. (Data from Sauerbeck, Jour. Roy. Stat. Soc., March 1909.) The figures
in columns 1 and 2 of the small table below show the index-numbers (or
percentages) of prices of certain animal foods in the years 1898 and 1908, on
their average prices during the years 1867-77. In column 3 have been added
the ratios of the index-numbers in 1908 to the index-numbers in 1898, the
latter being taken as 100.

Find the average ratio of prices in 1908 to prices in 1898, taken as 100:

(1) From the arithmetic mean of the ratios in col. 3.

(2) From the ratio of the arithmetic means of cols. 1 and 2.

(3) From the ratio of the geometric means of cols. 1 and 2.

(4) From the geometric mean of the ratios in col. 3.

Note that, by § 25, the last two methods must give the same result.



                            Index-number of price in       Ratio
Commodity.                    1898.        1908.           08/98.
                                1            2               3
1. Beef, prime . . . .         78           88             112.8
2. Beef, middling . . .        72           90             125.0
3. Mutton, prime . . .         84           92             109.5
4. Mutton, middling . .        67           95             141.8
5. Pork . . . . . . .          87           83              95.4
6. Bacon . . . . . . .         78           84             107.7
7. Butter . . . . . .          76           91             119.7






6. (Data from census of 1901.) The table below shows the population of
the rural sanitary districts of Essex, the urban sanitary districts (other than
the borough of West Ham), and the borough of West Ham, at the censuses
of 1891 and 1901. Estimate the total population of the county at a date
midway between the two censuses, (1) on the assumption that the percentage
rate of increase is constant for the county as a whole, (2) on the assumption
that the percentage rate of increase is constant in each group of districts and
the borough of West Ham.


                                     Population.
Essex.                           1891.          1901.
Rural districts . . . .         232,867        240,776
West Ham . . . . . . .          204,903        267,358
Other urban districts .         345,604        576,864
Total . . . . . . . .           783,374      1,083,998


7. (Data from Agricultural Statistics for 1905, Cd. 3061, 1906.) The
following statement shows the monthly average prices of eggs in Great
Britain in 1905, as compiled from the weekly returns of market prices for
first and second quality British eggs, per 120:


Month.                 First Quality.     Second Quality.
                          s.   d.            s.   d.
January . . . .           13   0             11   0
February . . .            11   0              9   0
March . . . . .            8   0              6   0
April . . . . .            7   6              6   6
May . . . . . .            8   0              7   6
June . . . . .             8   6              8   0
July . . . . .             9   6              8   6
August . . . .            11   0             10   0
September . . .           11   6             10   6
October . . . .           14   0             12   6
November . . .            18   0             16   0
December . . .            17   6             15   0
Mean for year             11   5½            10   0½


What would have been the mean price for the year in each case if the
wholesale prices had been recorded in the same way as retail prices, i.e. at
so many eggs per shilling? State your answer in the form of the equivalent
price per 120, and obtain it in the shortest way by taking the harmonic mean
of the above prices (cf. § 27).

8. Supposing the frequencies of values 0, 1, 2, . . . of a variable to be
given by the terms of the binomial series

\[ q^n, \quad n\,q^{n-1}p, \quad \frac{n(n-1)}{1 \cdot 2}\,q^{n-2}p^2, \quad \ldots \]

where $p + q = 1$, find the mean.





CHAPTER VIII.

MEASURES OF DISPERSION, ETC.

1. Inadequacy of the range as a measure of dispersion. 2-13. The standard
deviation: its definition, calculation, and properties. 14-19. The
mean deviation: its definition, calculation, and properties. 20-24. The
quartile deviation or semi-interquartile range. 25. Measures of
relative dispersion. 26. Measures of asymmetry or skewness. 27-30.
The method of grades or percentiles.

1. The simplest possible measure of the dispersion of a series of
values of a variable is the actual range, i.e. the difference between
the greatest and least values observed. While this is frequently
quoted, it is as a rule the worst of all possible measures for any
serious purpose. There are seldom real upper and lower limits
to the possible values of the variable, very large or very small
values being only more or less infrequent: the range is therefore
subject to meaningless fluctuations of considerable magnitude
according as values of greater or less infrequency happen to
have been actually observed. Note, for instance, the figures of
Table IX., Chap. VI. p. 95, showing the frequency distributions of
weights of adult males in the several parts of the United Kingdom.
In Wales, one individual was observed with a weight of
over 280 lbs., the next heaviest being under 260 lbs. The
addition of the one very exceptional individual has increased the
range by some 30 lbs., or about one-fifth. A measure subject to
erratic alterations by casual influences in this way is clearly not
of much use for comparative purposes. Moreover, the measure
takes no account of the form of the distribution within the limits
of the range; it might well happen that, of two distributions
covering precisely the same range of variation, the one showed
the observations for the most part closely clustered round the
average, while the other exhibited an almost even distribution of
frequency over the whole range. Clearly we should not regard
two such distributions as exhibiting the same dispersion, though
they exhibit the same range. Some sort of measure of dispersion
is therefore required, based, like the averages discussed in the last
chapter, on all the observations made, so that no single observation
can have an unduly preponderant effect on its magnitude; indeed,
the measure should possess all the properties laid down as
desirable for an average in § 4 of Chap. VII. There are three such
measures in common use: the standard deviation, the mean
deviation, and the quartile deviation or semi-interquartile range,
of which the first is the most important.

2. The Standard Deviation. The standard deviation is the
square root of the arithmetic mean of the squares of all deviations,
deviations being measured from the arithmetic mean of the
observations. If the standard deviation be denoted by σ, and a
deviation from the arithmetic mean by x, as in the last chapter,
then the standard deviation is given by the equation

$$\sigma^2 = \frac{1}{N}\,\Sigma(x^2) \qquad (1)$$

To square all the deviations may seem at first sight an artificial
procedure, but it must be remembered that it would be useless to
take the mere sum of the deviations, in order to obtain a measure
of dispersion, since this sum is necessarily zero if deviations be
taken from the mean. In order to obtain some quantity that
shall vary with the dispersion it is necessary to average the
deviations by a process that treats them as if they were all of the
same sign, and squaring is the simplest process for eliminating
signs which leads to results of algebraical convenience.

3. A quantity analogous to the standard deviation may be
defined in more general terms. Let A be any arbitrary value of
X, and let ξ (as in Chap. VII. § 8) denote the deviation of X
from A; i.e. let

$$\xi = X - A.$$

Then we may define the root-mean-square deviation s from the
origin A by the equation

$$s^2 = \frac{1}{N}\,\Sigma(\xi^2) \qquad (2)$$

In terms of this definition the standard deviation is the
root-mean-square deviation from the mean. There is a very simple
relation between the standard deviation and the root-mean-square
deviation from any other origin. Let

$$M - A = d \qquad (3)$$

so that

$$\xi = x + d.$$

Then

$$\xi^2 = x^2 + 2x\,d + d^2,$$

$$\Sigma(\xi^2) = \Sigma(x^2) + 2d\,\Sigma(x) + N\,d^2.$$

But the sum of the deviations from the mean is zero, therefore
the second term vanishes, and accordingly

$$s^2 = \sigma^2 + d^2 \qquad (4)$$

Hence the root-mean-square deviation is least when deviations
are measured from the mean, i.e. the standard deviation is the least
possible root-mean-square deviation.

Σ(ξ²), or Σ(fξ²) if we are dealing with a grouped distribution
and f is the frequency of ξ, is sometimes termed the second moment
of the distribution about A, just as Σ(ξ) or Σ(fξ) is termed
the first moment (cf. Chap. VII. § 8): we shall not make use
of the term in the present work. Generally, Σ(fξⁿ) is termed
the nth moment.
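
The relation (4) can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the function name `rms_deviation` and the six data values are illustrative assumptions chosen only to make the arithmetic easy to follow.

```python
import math


def rms_deviation(values, origin):
    """Root-mean-square deviation s of the values from an arbitrary origin A."""
    n = len(values)
    return math.sqrt(sum((v - origin) ** 2 for v in values) / n)


# Illustrative data (an assumption, not taken from the text).
values = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
mean = sum(values) / len(values)

# The standard deviation is the root-mean-square deviation from the mean.
sigma = rms_deviation(values, mean)

for origin in (mean, 4.0, 0.0):
    d = mean - origin                      # equation (3): M - A = d
    s = rms_deviation(values, origin)      # equation (2)
    # Equation (4): s^2 = sigma^2 + d^2, so s is never smaller than sigma.
    assert math.isclose(s ** 2, sigma ** 2 + d ** 2)
    print(f"A = {origin:4.1f}   s^2 = {s ** 2:7.3f}   sigma^2 + d^2 = {sigma ** 2 + d ** 2:7.3f}")
```

Whatever origin is tried, s² exceeds σ² by exactly d², which is why the mean is the origin giving the least root-mean-square deviation.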

4. If σ and d are the two sides of a right-angled triangle, s is
the hypotenuse. If, then, ME be the vertical through the
mean of a frequency-distribution (fig. 25), and MS be set off
equal to the standard deviation (on the same scale in which the
variable X is plotted along the base), SA will be the
root-mean-square deviation from the point A. This construction gives a
concrete idea of the way in which the root-mean-square deviation
depends on the origin from which deviations are measured. It
will be seen that for small values of d the difference of s from σ
will be very minute, since A will lie very nearly on the circle
drawn through M with centre S and radius SM: slight errors
in the mean due to approximations in calculation will not,
therefore, appreciably affect the value of the standard deviation.

[Fig. 25.]

5. If we have to deal with relatively few, say thirty or forty,
ungrouped observations, the method of calculating the standard
deviation is perfectly straightforward. It is illustrated by the
figures given below for the estimated average earnings of
agricultural labourers in 38 rural unions. The values (earnings)
are first of all totalled and the total divided by N to give the
arithmetic mean M, viz. 15s. 11¼d., or 15s. 11d. to the nearest
penny. The earnings being estimates, it is not necessary to take
the average to any higher degree of accuracy. Having found
the mean, the difference of each observation from the mean is
next written down as in col. 3, one penny being taken as the
unit: the signs are not entered, as they are not wanted, but the
work should be checked by totalling the positive and negative
differences separately. [The positive total is 300 and the
negative 290, thus checking the value for the mean, viz. 15s.
11d. + 10/38.]

Finally, each difference is squared, and the squares entered in
col. 4; tables of squares are useful for such work if any of the
differences to be squared are large (see list of Tables, p. 356).
The sum of the squares is 16,018. Treating the value taken for
the mean as sensibly accurate, we have

$$\sigma^2 = \frac{16018}{38} = 421.5$$

$$\sigma = 20.5\text{d.}$$

If we wish to be more precise we can reduce to the true mean
by the use of equation (4), as follows:

$$s^2 = \frac{16{,}018}{38} = 421.5263$$

$$d^2 = (0.2632)^2 = 0.0693$$

Hence

$$\sigma^2 = s^2 - d^2 = 421.4570$$

$$\sigma = 20.529\text{d.}$$

Evidently this reduction, in the given case, is unnecessary,
illustrating the fact mentioned at the end of § 4, that small
errors in the mean have little effect on the value found for the
standard deviation. The first value is correct within a very
small fraction of a penny.
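
The arithmetic of this section can be retraced in a few lines. The sketch below is an illustration only, written in Python with variable names of our own choosing; the earnings are those listed for the 38 unions in Example i, entered as (shillings, pence) pairs and converted to pence.

```python
import math

# Earnings of the 38 unions of Example i, as (shillings, pence).
earnings = [
    (20, 9), (20, 3), (19, 8), (18, 6), (17, 8), (17, 6), (17, 1), (17, 0),
    (17, 0), (16, 11), (16, 6), (16, 4), (16, 3), (16, 3), (16, 0), (16, 0),
    (15, 9), (15, 8), (15, 6), (15, 6), (15, 4), (15, 3), (15, 0), (15, 0),
    (15, 0), (15, 0), (15, 0), (15, 0), (14, 10), (14, 9), (14, 9), (14, 9),
    (14, 7), (14, 6), (14, 6), (14, 4), (13, 6), (12, 6),
]
pence = [12 * s + d for s, d in earnings]
n = len(pence)

# Working origin A: the mean to the nearest penny, 15s. 11d. = 191d.
origin = 12 * 15 + 11
differences = [p - origin for p in pence]

positive_total = sum(x for x in differences if x > 0)
negative_total = -sum(x for x in differences if x < 0)
sum_of_squares = sum(x * x for x in differences)

# Root-mean-square deviation from A, then reduction to the true mean by (4).
s_squared = sum_of_squares / n
d = (positive_total - negative_total) / n      # M - A, in pence
sigma = math.sqrt(s_squared - d * d)

print(f"positive total {positive_total}, negative total {negative_total}")
print(f"sum of squares {sum_of_squares}, s^2 = {s_squared:.4f}, d^2 = {d * d:.4f}")
print(f"sigma = {sigma:.3f}d.")
```

Working from the rounded mean and correcting afterwards mirrors the hand method; computing deviations directly from the exact mean would give the same σ.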





CALCULATION OF THE STANDARD DEVIATION. Example i. Calculation of
Mean and Standard Deviation for a Short Series of Observations
ungrouped. Estimated Average Weekly Earnings of Agricultural Labourers
in Thirty-eight Rural Unions, in 1892-3. (W. Little, Labour Commission;
Report, vol. v., part i., 1894.)

                                  2.            3.            4.
    1.                         Earnings     Difference   (Difference)²
    Union.                     s.   d.        (Pence)

     1. Glendale               20    9           58          3,364
     2. Wigton                 20    3           52          2,704
     3. Garstang               19    8           45          2,025
     4. Belper                 18    6           31            961
     5. Nantwich               17    8           21            441
     6. Atcham                 17    6           19            361
     7. Driffield              17    1           14            196
     8. Uttoxeter              17    0           13            169
     9. Wetherby               17    0           13            169
    10. Easingwold             16   11           12            144
    11. Southwell              16    6            7             49
    12. Hollingbourn           16    4            5             25
    13. Melton Mowbray         16    3            4             16
    14. Truro                  16    3            4             16
    15. Godstone               16    0            1              1
    16. Louth                  16    0            1              1
    17. Brixworth              15    9            2              4
    18. Crediton               15    8            3              9
    19. Holbeach               15    6            5             25
    20. Maldon                 15    6            5             25
    21. Monmouth               15    4            7             49
    22. St Neots               15    3            8             64
    23. Swaffham               15    0           11            121
    24. Thakeham               15    0           11            121
    25. Thame                  15    0           11            121
    26. Thingoe                15    0           11            121
    27. Basingstoke            15    0           11            121
    28. Cirencester            15    0           11            121
    29. N. Witchford           14   10           13            169
    30. Pewsey                 14    9           14            196
    31. Bromyard               14    9           14            196
    32. Wantage                14    9           14            196
    33. Stratford-on-Avon      14    7           16            256
    34. Dorchester             14    6           17            289
    35. Woburn                 14    6           17            289
    36. Buntingford            14    4           19            361
    37. Pershore               13    6           29            841
    38. Langport               12    6           41          1,681

        Total                 605    8         +300         16,018
                                               -290




The figures dealt with in this illustration are estimates of the
weekly earnings of the agricultural labourers, i.e. they include
allowances for gifts in kind, such as coal, potatoes, cider, etc. The
estimated weekly money wages are, however, also given in the
same Report, and we are thus enabled to make an interesting
comparison of the dispersions of the two. It might be expected
that earnings would vary less than wages, as his earnings and not
the mere money wages he receives are the important matter to
the labourer, and as a fact we find

    Standard deviation of weekly earnings . . 20.5d.
    Standard deviation of weekly wages  . . . 26.0d.

The arithmetic mean wage is 13s. 5d.

6. If we have to deal with a grouped frequency-distribution,
the same artifices and approximations are used as in the calculation
of the mean (Chap. VII. §§ 8, 9, 10). The mid-value of one of
the class-intervals is chosen as the arbitrary origin A from which
to measure the deviations ξ, the class-interval is treated as a
unit throughout the arithmetic, and all the observations within
any one class-interval are treated as if they were identical with
the mid-value of the interval. If, as before, we denote the
frequency in any one interval by f, these f observations contribute
fξ² to the sum of the squares of deviations and we have

$$s^2 = \frac{1}{N}\,\Sigma(f\xi^2).$$

The standard deviation is then calculated from equation (4).

7. The whole of the work proceeds naturally as an extension of
that necessary for calculating the mean, and we accordingly use
the same illustrations as in the last chapter. Thus in Example
ii. below, cols. 1, 2, 3, and 4 are the same as those we have already
given in Example i. of Chap. VII. for the calculation of the mean.
Column 5 gives the figures necessary for calculating the standard
deviation, and is derived directly from col. 4 by multiplying the
figures of that column again by ξ. Thus 90 × 5 = 450, 192 × 4 =
768, and so on. The work is therefore done very rapidly. The
remaining steps of the arithmetic are given below the table; the
student must be careful to remember the final conversion, if
necessary, from the class-interval as unit to the natural unit
of measurement. In this case the value found is 2.48
class-intervals, and the class-interval being half a unit, that is 1.24
per cent.





CALCULATION OF THE STANDARD DEVIATION. Example ii. Calculation of
the Standard Deviation of the Percentages of the Population in receipt of
Relief in addition to the Mean, from the figures of Table VIII. of
Chap. VI. (Cf. the work for the mean alone, p. 111.)

       (1)            (2)           (3)            (4)         (5)
    Percentage     Frequency     Deviation       Product     Product
    in receipt         f         from Value        fξ          fξ²
    of Relief                      A, ξ

       1.0            18           - 5             90          450
       1.5            48           - 4            192          768
       2.0            72           - 3            216          648
       2.5            89           - 2            178          356
       3.0           100           - 1            100          100
       3.5            90             0             ..           ..
       4.0            75           + 1             75           75
       4.5            60           + 2            120          240
       5.0            40           + 3            120          360
       5.5            21           + 4             84          336
       6.0            11           + 5             55          275
       6.5             5           + 6             30          180
       7.0             1           + 7              7           49
       7.5             1           + 8              8           64
       8.0            ..           + 9             ..           ..
       8.5             1           +10             10          100

      Total          632            ..           -776        4,001
                                                 +509

From previous work, p. 111, M - A = d = -0.4225 class-intervals.

$$s^2 = \frac{\Sigma(f\xi^2)}{N} = \frac{4001}{632} = 6.3307$$

$$\sigma^2 = 6.3307 - (0.4225)^2 = 6.1522$$

$$\therefore\ \sigma = 2.48\ \text{intervals} = 1.24\ \text{per cent.}$$
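
The grouped calculation of Example ii can be followed step by step in code. This is a sketch only, in Python, with variable names that are our own assumptions; the mid-values and frequencies are those of the table, and the origin A = 3.5 per cent and the class-interval of 0.5 per cent are as used in the text.

```python
import math

# Mid-values of the class-intervals (per cent) and their frequencies, Example ii.
mid_values = [1.0 + 0.5 * i for i in range(16)]           # 1.0, 1.5, ..., 8.5
frequencies = [18, 48, 72, 89, 100, 90, 75, 60, 40, 21, 11, 5, 1, 1, 0, 1]

class_interval = 0.5      # per cent
origin = 3.5              # arbitrary origin A (a mid-value)

# Deviations from A in class-interval units (col. 3): -5, -4, ..., +10.
xi = [round((m - origin) / class_interval) for m in mid_values]

n = sum(frequencies)
sum_f_xi = sum(f * x for f, x in zip(frequencies, xi))          # col. 4 total
sum_f_xi2 = sum(f * x * x for f, x in zip(frequencies, xi))     # col. 5 total

d = sum_f_xi / n                     # M - A in class-intervals
s_squared = sum_f_xi2 / n            # mean-square deviation about A
sigma_intervals = math.sqrt(s_squared - d * d)     # equation (4)
sigma_per_cent = sigma_intervals * class_interval  # final conversion of units

print(f"N = {n}, d = {d:.4f}, s^2 = {s_squared:.4f}")
print(f"sigma = {sigma_intervals:.2f} intervals = {sigma_per_cent:.2f} per cent")
```

The last line of the computation is the conversion the text warns about: the value in class-intervals must be multiplied by the width of the interval to return to the natural unit.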


To illustrate again the value of the standard deviation for
purposes of comparison, figures are given below showing the
means and standard deviations of similar distributions for a series
of years from 1850. It will be seen that not only did the mean
decrease during the period, but the standard deviation decreased
to an equally marked extent, having been halved between
1850 and 1891; the average was lowered, and at the same time
the percentages of the population in receipt of relief clustered
much more closely round the lower average.






Means and Standard Deviations of the Distributions of Pauperism (Percentage
of the Population in receipt of Poor-law Relief) in the Unions of England
and Wales since 1850. (From Yule, Jour. Roy. Stat. Soc., vol. lix.,
1896, figures slightly amended.)

                Percentage of the Population
                    in receipt of Relief
    Year.      Arithmetic        Standard
                  Mean           Deviation

    1850          6.51             2.50
    1860          5.20             2.07
    1870          5.45             2.02
    1881          3.68             1.36
    1891          3.29             1.24


8. In the table given on p. 141 (Example iii.), the calculation of
the standard deviation is similarly shown for the distribution of
the statures of adult males in the British Isles, the work being
continued from the stage which it reached for the calculation of
the mean in Example ii. of Chap. VII. The steps of the arithmetic
hardly call for further explanation, but it may be noted that
the class-interval being a unit in this case, no conversion of
the standard deviation from class-intervals to units is required.

9. The student must remember, as in the case of the calculation
of the mean, that the treatment of all values within each
class-interval as if they were identical with the mid-value of the interval
is an approximation and no more (cf. Chap. VII. § 11), though,
for a distribution of the symmetrical or moderately asymmetrical
type with a class-interval not greater than one-twentieth or so
of the range, the approximation may be a very close one. But
while the value of the arithmetic mean may be either increased
or decreased by grouping, in the case of distributions which are
not more than slightly asymmetrical, the standard deviation of
such distributions tends to be increased, and the increase is the
greater the cruder the grouping. We give an approximate
correction for this effect later (Chap. XI. § 4). The student is
recommended to test for himself the effect of grouping in two
or three cases.

10. It is a useful empirical rule to remember that a range of
six times the standard deviation usually includes 99 per cent. or
more of all the observations in the case of distributions of the
symmetrical or moderately asymmetrical type. Thus in Example






CALCULATION OF THE STANDARD DEVIATION. Example iii. Calculation
of the Standard Deviation of Stature of Male Adults in the British Isles
from the figures of Table VI., p. 88. (Cf. p. 112 for the calculation of
mean alone.)

       (1)           (2)           (3)            (4)          (5)
     Height.      Frequency     Deviation       Product      Product
     Inches.          f           from            fξ           fξ²
                                Value A, ξ

      57-              2          -10              20           200
      58-              4          - 9              36           324
      59-             14          - 8             112           896
      60-             41          - 7             287         2,009
      61-             83          - 6             498         2,988
      62-            169          - 5             845         4,225
      63-            394          - 4           1,576         6,304
      64-            669          - 3           2,007         6,021
      65-            990          - 2           1,980         3,960
      66-          1,223          - 1           1,223         1,223
      67-          1,329            0              ..            ..
      68-          1,230          + 1           1,230         1,230
      69-          1,063          + 2           2,126         4,252
      70-            646          + 3           1,938         5,814
      71-            392          + 4           1,568         6,272
      72-            202          + 5           1,010         5,050
      73-             79          + 6             474         2,844
      74-             32          + 7             224         1,568
      75-             16          + 8             128         1,024
      76-              5          + 9              45           405
      77-              2          +10              20           200

     Total         8,585           ..          -8,584        56,809
                                               +8,763

From previous work, M - A = d = +0.0209 class-intervals or inches.

$$s^2 = \frac{\Sigma(f\xi^2)}{N} = \frac{56809}{8585} = 6.6172$$

$$\sigma^2 = 6.6172 - (0.0209)^2 = 6.6168.$$

$$\therefore\ \sigma = 2.57\ \text{class-intervals or inches.}$$


ii. the standard deviation is 1.24 per cent., six times this is 7.44
per cent., and a range from 0.75 to 8.19 per cent. includes all
but one observation out of 632. In Example iii. the standard
deviation is 2.57 in., six times this is 15.42 in., and a range from,
say, 60 in. to 75.4 in. includes all but some 37 out of 8585
individuals, i.e. about 99.6 per cent. This rough rule serves to
give a more definite and concrete meaning to the standard
deviation, and also to check arithmetical work to some extent,
sufficiently, that is to say, to guard against very gross blunders.
It must not be expected to hold for short series of observations:
in Example i., for instance, the actual range is a good deal less
than six times the standard deviation.

11. The standard deviation is the measure of dispersion which
it is most easy to treat by algebraical methods, resembling in this
respect the arithmetic mean amongst measures of position. The
majority of illustrations of its treatment must be postponed to a
later stage (Chap. XI.), but the work of § 3 has already served as
one example, and we may take another by continuing the work of
§ 13 (b), Chap. VII. In that section it was shown that if a series
of observations of which the mean is M consist of two component
series, of which the means are M₁ and M₂ respectively,

$$N\,M = N_1 M_1 + N_2 M_2,$$

N₁ and N₂ being the numbers of observations in the two component
series, and N = N₁ + N₂ the number in the entire series.

Similarly, the standard deviation σ of the whole series may be
expressed in terms of the standard deviations σ₁ and σ₂ of the
components and their respective means. Let

$$M_1 - M = d_1, \qquad M_2 - M = d_2.$$

Then the mean-square deviations of the component series about
the mean M are, by equation (4), σ₁² + d₁² and σ₂² + d₂²
respectively. Therefore, for the whole series,

$$N\,\sigma^2 = N_1(\sigma_1^2 + d_1^2) + N_2(\sigma_2^2 + d_2^2) \qquad (5)$$

If the numbers of observations in the component series be equal
and the means be coincident, we have as a special case

$$\sigma^2 = \tfrac{1}{2}(\sigma_1^2 + \sigma_2^2) \qquad (6)$$

so that in this case the square of the standard deviation of the
whole series is the arithmetic mean of the squares of the standard
deviations of its components.

It is evident that the form of the relation (5) is quite general:
if a series of observations consists of r component series with
standard deviations σ₁, σ₂, . . . σ_r, and means diverging from the
general mean of the whole series by d₁, d₂, . . . d_r, the standard
deviation σ of the whole series is given (using m to denote any
subscript) by the equation

$$N\,\sigma^2 = \Sigma(N_m\,\sigma_m^2) + \Sigma(N_m\,d_m^2) \qquad (7)$$

Again, as in § 13 of Chap. VII., it is convenient to note, for the
checking of arithmetic, that if the same arbitrary origin be used
for the calculation of the standard deviations in a number of
component distributions we must have

$$N\,s^2 = N_1 s_1^2 + N_2 s_2^2 + \ldots \qquad (8)$$
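
Relation (7) lends itself to a direct numerical check. The sketch below is in Python and is offered only as an illustration: the two short component series are invented for the purpose, and the helper names `mean_and_sd` and `combined_sd` are assumptions of ours, not terms from the text.

```python
import math


def mean_and_sd(values):
    """Return (N, mean, standard deviation) of a series, dividing by N."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / n)
    return n, m, sd


def combined_sd(components):
    """Standard deviation of the whole series from equation (7).

    `components` is a list of (N_m, M_m, sigma_m) triples.
    """
    total_n = sum(n for n, _, _ in components)
    general_mean = sum(n * m for n, m, _ in components) / total_n
    total = sum(n * (sd ** 2 + (m - general_mean) ** 2) for n, m, sd in components)
    return math.sqrt(total / total_n)


# Two invented component series (illustrative values only).
series_1 = [12.0, 15.0, 11.0, 14.0]
series_2 = [20.0, 22.0, 19.0, 25.0, 24.0, 21.0]

via_formula = combined_sd([mean_and_sd(series_1), mean_and_sd(series_2)])
direct = mean_and_sd(series_1 + series_2)[2]

print(f"sigma from equation (7): {via_formula:.6f}")
print(f"sigma computed directly: {direct:.6f}")
```

The two figures agree, which is the practical use of the relation: the dispersion of a pooled series can be obtained from the summary constants of its parts without returning to the original observations.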
12. As another useful illustration, let us find the standard
deviation of the first N natural numbers. The mean in this case
is evidently (N + 1)/2. Further, as is shown in any elementary
Algebra, the sum of the squares of the first N natural numbers is

$$\frac{N(N+1)(2N+1)}{6}.$$

The standard deviation σ is therefore given by the equation

$$\sigma^2 = \tfrac{1}{6}(N+1)(2N+1) - \tfrac{1}{4}(N+1)^2,$$

that is,

$$\sigma^2 = \tfrac{1}{12}(N^2 - 1) \qquad (9)$$

This result is of service if the relative merit of, or the relative
intensity of some character in, the different individuals of a series
is recorded not by means of measurements, e.g. marks awarded on
some system of examination, but merely by means of their
respective positions when ranked in order as regards the character,
in the same way as boys are numbered in a class. With N
individuals there are always N ranks, as they are termed,
whatever the character, and the standard deviation is therefore
always that given by equation (9).

Another useful result follows at once from equation (9), namely,
the standard deviation of a frequency-distribution in which all
values of X within a range ± l/2 on either side of the mean are
equally frequent, values outside these limits not occurring, so that
the frequency-distribution may be represented by a rectangle. The
base l may be supposed divided into a very large number N of equal
elements, and the standard deviation reduces to that of the first N
natural numbers when N is made indefinitely large. The single
unit then becomes negligible compared with N, and consequently

$$\sigma^2 = \frac{l^2}{12} \qquad (10)$$
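
Both results of this section can be confirmed by brute force. The following Python sketch is an illustration under our own naming (`variance_of_first_n` and `rectangle_variance` are hypothetical helpers); the rectangle is approximated, as in the text, by dividing a base of length l into N equal elements.

```python
import math


def variance_of_first_n(n):
    """Square of the standard deviation of the natural numbers 1, 2, ..., n."""
    mean = (n + 1) / 2
    return sum((k - mean) ** 2 for k in range(1, n + 1)) / n


def rectangle_variance(base, n):
    """Variance of n equally frequent values spaced base/n apart."""
    values = [k * base / n for k in range(1, n + 1)]
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


# Equation (9): sigma^2 = (N^2 - 1) / 12 for any number of ranks N.
for n in (2, 5, 38, 1000):
    assert math.isclose(variance_of_first_n(n), (n * n - 1) / 12)

# Equation (10): as N grows, the variance approaches l^2 / 12.
base = 3.0
for n in (10, 100, 10_000):
    print(f"N = {n:6d}   variance = {rectangle_variance(base, n):.8f}"
          f"   l^2/12 = {base ** 2 / 12:.8f}")
```

With N = 38 ranks, for instance, equation (9) gives σ² = 1443/12 = 120.25 whatever the character ranked.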


13. It will be seen from the preceding paragraphs that the
standard deviation possesses the majority at least of the properties
which are desirable in a measure of dispersion as in an average
(Chap. VII. § 4). It is rigidly defined; it is based on all the
observations made; it is calculated with reasonable ease; it lends
itself readily to algebraical treatment; and we may add, though the
student will have to take the statement on trust for the present,
that it is, as a rule, the measure least affected by fluctuations of
sampling. On the other hand, it may be said that its general
nature is not very readily comprehended, and that the process of
squaring deviations and then taking the square root of the mean
seems a little involved. The student will, however, soon surmount
this feeling after a little practice in the calculation and use of the
constant, and will realise, as he advances further, the advantages
that it possesses. Such root-mean-square quantities, it may be
added, frequently occur in other branches of science. The
standard deviation should always be used as the measure of
dispersion, unless there is some very definite reason for preferring another
measure, just as the arithmetic mean should be used as the measure
of position. It may be added here that the student will meet with
the standard deviation under many different names, of which we
have adopted the most recent (due to Pearson, ref. 2): many of
the earlier names are hardly adapted to general use, as they bear
evidence of their derivation from the theory of errors of observation.
Thus the terms "mean error" (Gauss), "error of mean square"
(Airy), and "mean square error" have all been used in the same
sense. The standard deviation multiplied by the square root of
2 has been termed the "modulus" (Airy) (the student will see
later the reason for the adoption of the factor), and the reciprocal
of the modulus the "precision" (Lexis). For the square of the
standard deviation, often required, R. A. Fisher has suggested
the term "variance."

14. The Mean Deviation. The mean deviation of a series of
values of a variable is the arithmetic mean of their deviations
from some average, taken without regard to their sign. The
deviations may be measured either from the arithmetic mean or
from the median, but the latter is the natural origin to use. Just
as the root-mean-square deviation is least when deviations are
measured from the arithmetic mean, so the mean deviation is
least when deviations are measured from the median. For
suppose that, for some origin exceeded by m values out of N, the
mean deviation has a value Δ. Let the origin be displaced by
an amount c until it is just exceeded by m - 1 of the values only,
i.e. until it coincides with the mth value from the upper end of
the series. By this displacement of the origin the sum of deviations
in excess of the origin is reduced by m·c, while the sum of
deviations in defect of the origin is increased by (N - m)c. The
new mean deviation is therefore

$$\Delta + \frac{(N - m)c - mc}{N} = \Delta + \frac{1}{N}(N - 2m)c.$$

The new mean deviation is accordingly less than the old so long as

$$m > \tfrac{1}{2}N.$$

That is to say, if N be even, the mean deviation is constant for
all origins within the range between the N/2th and the (N/2 + 1)th
observations, and this value is the least: if N be odd, the mean
deviation is lowest when the origin coincides with the (N + 1)/2th
observation. The mean deviation is therefore a minimum when
deviations are measured from the median or, if the latter be
indeterminate, from an origin within the range in which it lies.
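
The argument above can be seen at work on a small series. The Python sketch below is an illustration only: the function name `mean_deviation` and the two short series (one with an odd and one with an even number of observations) are assumptions introduced for the example.

```python
def mean_deviation(values, origin):
    """Arithmetic mean of the deviations from `origin`, taken without sign."""
    return sum(abs(v - origin) for v in values) / len(values)


# Invented series: N odd (median 4) and N even (median anywhere from 4 to 8).
odd_series = [1, 3, 4, 8, 20]
even_series = [1, 3, 4, 8, 20, 30]

odd_mean = sum(odd_series) / len(odd_series)

print("N odd:")
for origin in (3, 4, 5, odd_mean):
    print(f"  origin {origin:5.2f}   mean deviation {mean_deviation(odd_series, origin):.3f}")

print("N even:")
for origin in (3, 4, 6, 8, 9):
    print(f"  origin {origin:5.2f}   mean deviation {mean_deviation(even_series, origin):.3f}")
```

For the odd series the least value occurs at the middle observation; for the even series the value is the same for every origin between the two middle observations and greater outside them.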

15. The calculation of the mean deviation either from the mean
or from the median for a series of ungrouped observations is very
simple. Take the figures of Example i. (p. 137) as an illustration.
We have already found the mean (15s. 11d. to the nearest penny),
and the deviations from the mean are written down in column 3.
Adding up this column without respect to the sign of the
deviations we find a total of 590. The mean deviation from the mean
is therefore 590/38 = 15.53d. The mean deviation from the
median is calculated in precisely the same way, but the median
replaces the mean as the origin from which deviations are measured.
The median is 15s. 6d. The deviations in pence run 63, 57, 50,
36, and so on; their sum is 570, and, accordingly, the mean
deviation from the median is 15d. exactly.

16. In the case of a grouped frequency-distribution, the sum
of deviations should be calculated first from the centre of the
class-interval in which the mean (or median) lies, and then
reduced to the mean as origin. Thus in the case of Example ii.
the mean is 3.29 per cent. and lies in the class-interval centring
round 3.5 per cent. We have already found that the sum of
deviations in defect of 3.5 per cent. is 776, and of deviations in
excess 509: total (without regard to sign) 1285, the unit of
measurement being, of course, as it is necessary to remember, the
class-interval. If the number of observations below the mean is
N₁ and above the mean N₂, and M - A = d, as before, we have to
add N₁·d to the sum found and subtract N₂·d. In the present
case N₁ = 327 and N₂ = 305, while d = -0.42 class-intervals,
therefore

$$d(N_1 - N_2) = -0.42 \times 22 = -9.2,$$

and the sum of deviations from the mean is 1285 - 9.2 = 1275.8.
Hence the mean deviation from the mean is 1275.8/632 = 2.019
class-intervals, or 1.01 per cent.
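
The reduction from the working origin to the mean can be verified against a direct summation. The sketch below, in Python, is illustrative only and uses names of our own choosing; the frequencies are those of Example ii, with deviations measured in class-intervals from A = 3.5 per cent. It carries d to full accuracy rather than rounding to -0.42, so the intermediate figures differ slightly from those in the text.

```python
import math

# Example ii: deviations from A = 3.5 per cent in class-intervals, and frequencies.
xi = list(range(-5, 11))
frequencies = [18, 48, 72, 89, 100, 90, 75, 60, 40, 21, 11, 5, 1, 1, 0, 1]
class_interval = 0.5                      # per cent

n = sum(frequencies)
d = sum(f * x for f, x in zip(frequencies, xi)) / n      # M - A

# Sum of deviations from A without regard to sign (776 + 509).
sum_from_origin = sum(f * abs(x) for f, x in zip(frequencies, xi))

# Numbers of observations below and above the mean (mid-value assumption).
n_below = sum(f for f, x in zip(frequencies, xi) if x < d)
n_above = sum(f for f, x in zip(frequencies, xi) if x > d)

# Add N1*d and subtract N2*d to reduce to the mean as origin.
sum_from_mean = sum_from_origin + d * (n_below - n_above)
mean_deviation = sum_from_mean / n

# Check by summing the deviations from the mean directly.
direct = sum(f * abs(x - d) for f, x in zip(frequencies, xi)) / n

print(f"N1 = {n_below}, N2 = {n_above}, sum from A = {sum_from_origin}")
print(f"mean deviation = {mean_deviation:.3f} intervals"
      f" = {mean_deviation * class_interval:.2f} per cent")
```

The same routine serves for the median of § 17 if the median, expressed in class-intervals from the chosen origin, is put in place of d.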

17. The mean deviation from the median should be found in
precisely similar fashion, but the mid-value of the interval in
which the median (instead of the mean) lies should, for convenience,
be taken as origin. Thus in Example ii. the median is
(Chap. VII. § 15) 3.195 per cent. Hence 3.0 per cent. should be
taken as the origin, d = +0.39 intervals, N₁ = 327, N₂ = 305. The
deviation-sum with 3.0 as origin is found to be 1263, and the
correction is +0.39 × 22 = +8.6. Hence the mean deviation
from the median is 2.012 intervals, or again 1.01 per cent. The
value is really smaller than that of the mean deviation from the
arithmetic mean, but the difference is too slight to affect the
second place of decimals.

It should be noted that, as in the case of the standard deviation,
this method of calculation implies the assumption that all the
values of X within any one class-interval may be treated as if
they were the mid-value of that interval. This is, of course, an
approximation, but as a rule gives results of amply sufficient
accuracy for practice if the class-interval be kept reasonably small
(cf. again Chap. VI. § 5). We have left it as an exercise to the
student to find the correction to be applied if the values in each
interval are treated as if they were evenly distributed over the
interval, instead of concentrated at its centre (Question 7).

18. The mean deviation, it will be seen, can be calculated rather
more rapidly than the standard deviation, though in the case of a
grouped distribution the difference in ease of calculation is not
great. It is not, on the other hand, a convenient magnitude for
algebraical treatment; for example, the mean deviation of a
distribution obtained by combining several others cannot in general
be expressed in terms of the mean deviations of the component
distributions, but depends upon their forms. As a rule, it is more
affected by fluctuations of sampling than is the standard deviation,
but may be less affected if large and erratic deviations lying
somewhat beyond the bulk of the distribution are liable to occur.
This may happen, for example, in some forms of experimental
work, and in such cases the use of the mean deviation may be
slightly preferable to that of the standard deviation.

19. It is a useful empirical rule for the student to remember
that for symmetrical or only moderately asymmetrical
distributions, approaching the ideal forms of figs. 5 and 9, the mean
deviation is usually very nearly four-fifths of the standard
deviation. Thus for the distribution of pauperism we have

$$\frac{\text{mean deviation}}{\text{standard deviation}} = \frac{1.01}{1.24} = 0.81.$$

In the case of the distribution of male statures in the British
Isles, Example iii., the ratio found is 0.80. For a short series of
observations like the wage statistics of Example i. a regular result
could hardly be expected: the actual ratio is 15.0/20.5 = 0.73.

We pointed out in § 10 that in distributions of the simple forms
referred to, a range of six times the standard deviation contains
over 99 per cent. of all the observations. If the mean deviation
be employed as the measure of dispersion, we must substitute a
range of 7½ times this measure.

20. The Quartile Deviation or Semi-interquartile Range. If a
value Q₁ of the variable be determined of such magnitude that
one-quarter of all the values observed are less than Q₁ and
three-quarters greater, then Q₁ is termed the lower quartile. Similarly,
if a value Q₃ be determined such that three-quarters of all the
values observed are less than Q₃ and one-quarter only greater,
then Q₃ is termed the upper quartile. The two quartiles and the
median divide the observed values of the variable into four
classes of equal frequency. If Mi be the value of the median, in
a symmetrical distribution

$$Mi - Q_1 = Q_3 - Mi,$$

and the difference may be taken as a measure of dispersion. But
as no distribution is rigidly symmetrical, it is usual to take as the
measure

$$Q = \frac{Q_3 - Q_1}{2},$$

and Q is termed the quartile deviation, or better, the
semi-interquartile range: it is not a measure of the deviation from
any particular average. The old name probable error should be
confined to the theory of sampling (Chap. XV. § 17).

21. In the case of a short series of ungrouped observations
the quartiles are determined, like the median, by inspection.
In the wage statistics of Example i., for instance, there are
38 observations, and 38/4 = 9.5. What is the lower quartile?
The student may be tempted to take it halfway between the
ninth and tenth observations from the bottom of the list;
but this would be wrong, for then there would be nine
observations only below the value chosen instead of 9.5. The
quartile must be taken as given by the tenth observation
itself, which may be regarded as divided by the quartile, and
falling half above it and half below. Therefore

    Lower quartile Q₁ = 14s. 10d.
    Upper quartile Q₃ = 16s. 11d.

and

$$Q = \frac{Q_3 - Q_1}{2} = 12.5\text{d.}$$

22. In the case of a grouped distribution, the quartiles, like
the median, are determined by simple arithmetical or by
graphical interpolation (cf. Chap. VII. §§ 15, 16). Thus for the
distribution of pauperism, Example ii., we have

    632/4                                     = 158
    Total frequency under 2.25 per cent.      = 138
                                                ---
    Difference                                =  20
    Frequency in interval 2.25 - 2.75         =  89

Whence

$$Q_1 = 2.25 + \frac{20}{89} \times 0.5 = 2.362\ \text{per cent.}$$

Similarly we find

$$Q_3 = 4.130\ \text{per cent.}$$

Hence

$$Q = \frac{Q_3 - Q_1}{2} = 0.884\ \text{per cent.}$$

It is left to the student to check the value by graphical
interpolation.
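
The arithmetical interpolation may be sketched in code as follows. This is an illustration in Python, not the author's procedure verbatim: the function `grouped_quantile` is a hypothetical helper, and the class limits 0.75, 1.25, . . . are inferred from the limits 2.25 - 2.75 quoted above for the interval centred on 2.5 per cent.

```python
def grouped_quantile(lower_limits, frequencies, width, fraction):
    """Value below which `fraction` of the total frequency lies,
    found by simple interpolation within the class-interval reached."""
    target = fraction * sum(frequencies)
    cumulative = 0
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("fraction must lie between 0 and 1")


# Example ii: class-intervals of 0.5 per cent. centred on 1.0, 1.5, ..., 8.5.
width = 0.5
lower_limits = [0.75 + width * i for i in range(16)]
frequencies = [18, 48, 72, 89, 100, 90, 75, 60, 40, 21, 11, 5, 1, 1, 0, 1]

q1 = grouped_quantile(lower_limits, frequencies, width, 0.25)
median = grouped_quantile(lower_limits, frequencies, width, 0.50)
q3 = grouped_quantile(lower_limits, frequencies, width, 0.75)
q = (q3 - q1) / 2

print(f"Q1 = {q1:.3f}, median = {median:.3f}, Q3 = {q3:.3f}")
print(f"semi-interquartile range Q = {q:.3f} per cent")
```

The median is included because the same interpolation yields it, and it reproduces the value 3.195 per cent. quoted for this distribution in § 17.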

23. For distributions approaching the ideal forms of figs.
5 and 9, the semi-interquartile range is usually about two-thirds
of the standard deviation. Thus for Example ii. we find

    $\frac{Q}{\sigma} = \frac{0.884}{1.24} = 0.71.$

The distribution of statures, Example iii., gives the ratio 0.68.
The short series of wage statistics in Example i. could not be
expected to give a result in very strict conformity with the
rule, but the actual ratio, viz. 0.61, does not diverge greatly.
It follows from this ratio that a range of nine times the
semi-interquartile range, approximately, is required to cover the same
proportion of the total frequency (99 per cent. or more) as a range
of six times the standard deviation.
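The two-thirds rule and the equivalence of the two ranges may be illustrated numerically. The sketch below assumes, purely for illustration, that the ideal symmetrical form is the normal curve; the text itself makes no such restriction, and the variable names are hypothetical.

```python
from statistics import NormalDist

# Ratio found in the text for Example ii.
ratio_example_ii = 0.884 / 1.24

# ASSUMPTION: a normal curve of unit standard deviation as the ideal form.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# The upper quartile of a unit normal curve equals Q / sigma.
q_over_sigma = std_normal.inv_cdf(0.75)

# Proportion of frequency within a range of six times the standard deviation.
coverage_6_sigma = std_normal.cdf(3) - std_normal.cdf(-3)

# Proportion within a range of nine times the semi-interquartile range.
half_range = 4.5 * q_over_sigma
coverage_9_q = std_normal.cdf(half_range) - std_normal.cdf(-half_range)
```

Under this assumption the ratio Q/σ is a little over two-thirds, and both ranges cover rather more than 99 per cent. of the frequency.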

24. Of the three measures of dispersion, the semi-interquartile
range has the most clear and simple meaning. It is calculated,
like the median, with great ease, and the quartiles may be found,
if necessary, by measuring two individuals only. If, e.g., the
dispersion as well as the average stature of a group of men
is required to be determined with the least possible expenditure
of time, they may be simply ranked in order of height, and the
three men picked out for measurement who stand in the centre
and one-quarter from either end of the rank. This measure of
dispersion may also be useful as a makeshift if the calculation
of the standard deviation has been rendered difficult or impossible
owing to the employment of an irregular classification of the
frequency or of an indefinite terminal class. Such uses are,
however, a little exceptional, and, generally speaking, the
semi-interquartile range as a measure of dispersion is not to be
recommended, unless simplicity of meaning is of primary
importance, owing to the lack of algebraical convenience which
it shares with the median. Further, it is obvious that the
quartile, like the median, may become indeterminate, and that
the use of this measure of dispersion is undesirable in cases of
discontinuous variation: the student should refer again to the
discussion of the similar disadvantage in the case of the median,
Chap. VII. § 14. It has, however, been largely used in the past,
particularly for anthropometric work.

25. Measures of Relative Dispersion. As was pointed out in
Chapter VII., if relative size is regarded as influencing not only
the average, but also deviations from the average, the geometric
mean seems the natural form of average to use, and deviations
should be measured by their ratios to the geometric mean. As
already stated, however, this method of measuring deviations, with
its accompanying employment of the geometric mean, has never
come into general use. It is a much more simple matter to allow
for the influence of size by taking the ratio of the measure of
absolute dispersion (e.g. standard deviation, mean deviation, or
quartile deviation) to the average (mean or median) from which
the deviations were measured. Pearson has termed the quantity

    $v = 100 \frac{\sigma}{M},$

i.e. the percentage ratio of the standard deviation to the arithmetic
mean, the coefficient of variation (ref. 7), and has used it, for
example, in comparing the relative variations of corresponding
organs or characters in the two sexes: the ratio of the quartile
deviation to the median has also been suggested (Verschaeffelt,
ref. 8). Such a measure of relative dispersion is evidently a mere
number, and its magnitude is independent of the units of
measurement employed.
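That the coefficient of variation is a mere number, unaffected by the unit of measurement, may be seen from the following Python sketch. The statures are invented for the illustration, the function name is hypothetical, and the standard deviation is taken with divisor N (`pstdev`), which is assumed here to correspond to the definition used in this chapter.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Percentage ratio of the standard deviation to the arithmetic mean."""
    return 100 * pstdev(values) / mean(values)

# Hypothetical statures, first in inches and then in centimetres.
heights_in = [64.0, 66.5, 67.0, 68.5, 70.0, 71.5]
heights_cm = [2.54 * h for h in heights_in]

v_in = coefficient_of_variation(heights_in)
v_cm = coefficient_of_variation(heights_cm)
# v_in and v_cm agree apart from rounding in the arithmetic.
```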

26. Measures of Asymmetry or Skewness. If we have to compare
a series of distributions of varying degrees of asymmetry, or
skewness, as Pearson has termed it, some numerical measure of this
character is desirable. Such a measure of skewness should
obviously be independent of the units in which we measure the
variable (e.g. the skewness of the distribution of the weights of a
given set of men should not be dependent on our choice of the
pound, the stone, or the kilogramme as the unit of weight) and
the measure should accordingly be a mere number. Thus the
difference between the deviations of the two quartiles on either
side of the median indicates the existence of skewness, but to
measure the degree of skewness we should take the ratio of this
difference to some quantity of the same dimensions, e.g. the
semi-interquartile range. Our measure would then be, taking the
skewness to be positive if the longer tail of the distribution runs
in the direction of high values of X,

    skewness $= \frac{(Q_3 - Mi) - (Mi - Q_1)}{Q} = \frac{Q_1 + Q_3 - 2Mi}{Q}$    (11)

where Mi is the median and Q the semi-interquartile range.
This would not be a bad measure if we were using the quartile
deviation as a measure of dispersion: its lowest value is zero,
when the distribution is symmetrical; and while its highest possible
value is 2, it would rarely in practice attain higher numerical
values than ±1. A similar measure might be based on the mean
deviations in excess and in defect of the mean. There is, however,
only one generally recognised measure of skewness, and that is
Pearson's measure (ref. 9):

    skewness $= \frac{\text{mean} - \text{mode}}{\text{standard deviation}}$    (12)

This is evidently zero for a symmetrical distribution, in which
mode and mean coincide. No upper limit to the ratio is apparent
from the formula, but, as a fact, the value does not exceed unity for
frequency-distributions resembling generally the ideal distributions
of fig. 9. As the mode is a difficult form of average to determine
by elementary methods, it may be noted that the numerator of the
above fraction may, in the case of frequency-distributions of the
forms referred to, be replaced approximately by 3(mean - median)
(cf. Chap. VII. § 20). The measure (12) is much more sensitive
than (11) for moderate degrees of asymmetry.
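The two measures may be sketched in Python as follows. The quartile measure is written as the difference between the deviations of the two quartiles from the median, divided by the semi-interquartile range, as described in words above; Pearson's measure is given in the approximate form with 3(mean - median) in the numerator. The function names and the small series of values are hypothetical, and the standard deviation is assumed to be taken with divisor N.

```python
from statistics import mean, median, pstdev

def quartile_skewness(q1, med, q3):
    """(Q3 - median) - (median - Q1), divided by the semi-interquartile range."""
    semi_iqr = (q3 - q1) / 2
    return ((q3 - med) - (med - q1)) / semi_iqr

def pearson_skewness_approx(values):
    """Pearson's measure with the mode replaced via 3(mean - median)."""
    return 3 * (mean(values) - median(values)) / pstdev(values)

symmetrical = quartile_skewness(2.0, 3.0, 4.0)   # median midway: zero
extreme = quartile_skewness(2.0, 2.0, 4.0)       # median at Q1: the limit 2

# Hypothetical series with a long tail towards high values.
skewed = pearson_skewness_approx([1, 2, 3, 4, 10])
balanced = pearson_skewness_approx([1, 2, 3, 4, 5])
```

The limiting value 2 of the quartile measure is reached only when the median coincides with one of the quartiles, which is why values beyond ±1 would rarely occur in practice.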

27. The Method of Percentiles. We may conclude this chapter
by describing briefly a method that has been largely used in the
past in lieu of the methods dealt with in Chapters VI. and VII.,
and the preceding paragraphs of this chapter, for summarising
such statistics as we have been considering. If the values of the
variable (variates, as they are sometimes termed) be ranged in
order of magnitude, and a value P of the variable be determined
such that a percentage p of the total frequency lies below it and
100 - p above, then P is termed a percentile. If a series of
percentiles be determined for short intervals, e.g. 5 per cent. or 10
per cent., they suffice by themselves to show the general form
of the distribution. This is Sir Francis Galton's method of
percentiles. The deciles, or values of the variable which divide
the total frequency into ten equal parts, form a natural and
convenient series of percentiles to use. The fifth decile, or value
of the variable which has 50 per cent. of the observed values
above it and 50 per cent. below, is the median: the two quartiles
lie between the second and third and the seventh and eighth
deciles respectively.

28. The deciles, like the median and quartiles, may be
determined either by arithmetical or by graphical interpolation,
excluding the cases in which, like the former constants, they
become indeterminate (cf. § 24). It is hardly necessary to give
an illustration of the former process, as the method is precisely
the same as for median and quartiles (Chap. VII. § 15, and above,
§ 22). Fig. 26 shows, of course on a very much reduced scale, the
curve used for obtaining the deciles by the graphical method in
the case of the distribution of pauperism (Example ii. above).
The figures of the original table are added up step by step from
the top, so as to give the total frequency not exceeding the upper
limit of each class-interval, and ordinates are then erected to a
horizontal base to represent on some scale these integrated
frequencies: a smooth curve is then drawn through the tops of
the ordinates so obtained. This curve, as will be seen from the
figure, rises slowly at first when the frequencies are small, then
more rapidly as they increase, and finally turns over again and
becomes quite flat as the frequencies tail off to zero. The deciles
may be readily obtained from such a curve by dividing the
terminal ordinate into ten equal parts, and projecting the points
so obtained horizontally across to the curve and then vertically
down to the base. The construction is indicated on the figure for
the fourth decile, the value of which is approximately 2.88 per cent.

Fig. 26. Curve showing the number of Districts of England and Wales in
which the Pauperism on 1st January 1891 did not exceed any given
percentage of the population (same data as Fig. 10, p. 92): graphical
determination of Deciles.
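The integration of the frequencies and the reading-off of a percentile may be imitated arithmetically. In the following Python sketch straight lines between the tops of the ordinates take the place of the smooth curve, so that it reproduces the arithmetical rather than the graphical method; the function name and the small frequency table are hypothetical and are not the data of Example ii.

```python
from itertools import accumulate

def percentile_from_grouped(upper_limits, freqs, first_lower, p):
    """Value below which p per cent. of the total frequency lies.

    upper_limits : upper limit of each class-interval, in ascending order
    freqs        : frequency in each class-interval
    first_lower  : lower limit of the first class-interval
    """
    cumulative = list(accumulate(freqs))   # the integrated frequencies
    target = cumulative[-1] * p / 100
    lower, below = first_lower, 0
    for upper, cum, f in zip(upper_limits, cumulative, freqs):
        if target <= cum and f > 0:
            # Interpolate linearly within the interval containing the target.
            return lower + (target - below) / f * (upper - lower)
        lower, below = upper, cum
    return upper_limits[-1]

# Hypothetical distribution: intervals 0-1, 1-2, ..., 4-5.
limits = [1, 2, 3, 4, 5]
freqs = [10, 20, 40, 20, 10]

fourth_decile = percentile_from_grouped(limits, freqs, 0, 40)
median_value = percentile_from_grouped(limits, freqs, 0, 50)
deciles = [percentile_from_grouped(limits, freqs, 0, 10 * k) for k in range(1, 10)]
```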
29. The curve of fig. 26 may be drawn in a different way by
taking a horizontal base divided into ten or a hundred equal
parts (grades, as Sir Francis Galton has termed them), and erecting
at each point so obtained a vertical proportional to the
corresponding percentile. This gives the curve of fig. 27, which was
obtained by merely redrafting fig. 26. The curve is of so-called
ogive form. The ogive curve for the distribution of statures
(Example iii.) is shown for comparison in fig. 28. It will be noticed
that the ogive curve does not bring out the asymmetry of the
distribution of pauperism nearly so clearly as the
frequency-polygon, fig. 10, p. 92.

Fig. 27. The curve of Fig. 26 redrawn so as to give the Pauperism
corresponding to each grade: Galton's "Ogive." (Horizontal scale:
grades, 0 to 100.)

30. The method of percentiles has some advantages as a method
of representation, as the meaning of the various percentiles is so
simple and readily understood. An extension of the method to
the treatment of non-measurable characters has also become of
some importance. For example, the capacity of the different boys
in a class as regards some school subject cannot be directly
measured, but it may not be very difficult for the master to
arrange them in order of merit as regards this character: if the
boys are then "numbered up" in order, the number of each boy,
or his rank, serves as some sort of index to his capacity (cf. the
remarks in § 12. It should be noted that rank in this sense is
not quite the same as grade; if a boy is tenth, say, from the
bottom in a class of a hundred his grade is 9.5, but the method
is in principle the same with that of grades or percentiles).
The method of ranks, grades, or percentiles in such a case may
be a very serviceable auxiliary, though, of course, it is better if
possible to obtain a numerical measure. But if, in the case of a
measurable character, the percentiles are used not merely as
constants illustrative of certain aspects of the
frequency-distribution, but entirely to replace the table giving the
frequency-distribution, serious inconvenience may be caused, as the
application of other methods to the data is barred. Given the
table showing the frequency-distribution, the reader can calculate
not only the percentiles, but any form of average or measure of
dispersion that has yet been proposed, to a sufficiently high
degree of approximation. But given only the percentiles, or at
least so few of them as the nine deciles, he cannot pass back to
the frequency-distribution, and thence to other constants, with any
degree of accuracy. In all cases of published work, therefore,
the figures of the frequency-distribution should be given; they
are absolutely fundamental.

Fig. 28. Ogive Curve for Stature, same data as Fig. 6, p. 89.
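The distinction between rank and grade may be put into a line or two of Python. The formula is an assumed generalisation of the single case quoted in the text (tenth from the bottom in a class of a hundred, grade 9.5): each individual is supposed to occupy the middle of his own step on the scale of grades. The function name is hypothetical.

```python
def grade_from_rank(rank_from_bottom, class_size):
    """Grade (position on a scale of 0 to 100) of the holder of a given rank.

    ASSUMPTION: the individual is treated as standing at the middle of his
    own step, so rank r in a class of n corresponds to 100 * (r - 0.5) / n.
    """
    return 100 * (rank_from_bottom - 0.5) / class_size

print(grade_from_rank(10, 100))  # 9.5
```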





REFERENCES.

General.

(1) Fechner, G. T., "Ueber den Ausgangswerth der kleinsten
    Abweichungssumme, dessen Bestimmung, Verwendung und Verallgemeinerung,"
    Abh. d. kgl. sächs. Ges. d. Wissenschaften, vol. xviii. (also numbered
    vol. xi. of the Abh. d. math.-phys. Classe); Leipzig, 1878, p. 1.

Standard Deviation.

(2) Pearson, Karl, "Contributions to the Mathematical Theory of Evolution
    (i. On the Dissection of Asymmetrical Frequency-curves)," Phil. Trans.
    Roy. Soc., Series A, vol. clxxxv., 1894, p. 71. (Introduction of the
    term "standard deviation," p. 80.)

Mean Deviation.

(3) Laplace, Pierre Simon, Marquis de, Théorie analytique des
    probabilités: 2me supplément, 1818. (Proof that the mean deviation is a
    minimum when taken about the median.)

(4) Trachtenberg, M. I., "A Note on a Property of the Median," Jour.
    Roy. Stat. Soc., vol. lxxviii., 1915, p. 454. (A very simple proof of
    the same property.)

Method of Percentiles, including Quartiles, etc.

(5) Galton, Francis, "Statistics by Intercomparison, with Remarks on the
    Law of Frequency of Error," Phil. Mag., vol. xlix. (4th Series), 1876,
    pp. 33-46.

(6) Galton, Francis, Natural Inheritance; Macmillan, 1889. (The method
    of percentiles is used throughout, with the quartile deviation as the
    measure of dispersion.)

Relative Dispersion.

(7) Pearson, Karl, "Regression, Heredity, and Panmixia," Phil. Trans.
    Roy. Soc., Series A, vol. clxxxvii., 1896, p. 253. (Introduction of
    "coefficient of variation," pp. 276-7.)

(8) Verschaeffelt, E., "Ueber graduelle Variabilität von pflanzlichen
    Eigenschaften," Ber. deutsch. bot. Ges., Bd. xii., 1894, pp. 350-65.

Skewness.

(9) Pearson, Karl, "Skew Variation in Homogeneous Material," Phil.
    Trans. Roy. Soc., Series A, vol. clxxxvi., 1895, p. 343. (Introduction
    of term, p. 370.)

Calculation of Mean, Standard-deviation, or of the General
Moments of a Grouped Distribution.

We have given a direct method that seems the simplest and best for
the elementary student. A process of successive summation that has
some advantages can, however, be used instead. The student will
find a convenient description with illustrations in:

(10) Elderton, W. Palin, Frequency-curves and Correlation; C. & E.
     Layton, London, 1906.





EXERCISES. 

1. Verify the following from the data of Table VI., Chap. VI., continuing
the work from the stage reached for Qu. 1, Chap. VII.

Stature in Inches for Adult Males born in:

                                          England  Scotland   Wales  Ireland
Standard deviation                           2.56      2.50    2.35     2.17
Mean deviation                               2.05      1.95    1.82     1.69
Quartile deviation                           1.78      1.56    1.46     1.35
Mean deviation / standard deviation          0.80      0.78    0.78     0.78
Quartile deviation / standard deviation      0.69      0.62    0.62     0.62
Lower quartile                              65.55     66.92   65.06    66.39
Upper quartile                              69.10     70.04   67.98    69.10

2. (Continuing from Qu. 2, Chap. VII.) Find the standard deviation,
mean deviation, quartiles and quartile deviation (or semi-interquartile range)
for the distribution of weights of adult males in the United Kingdom given in
the last column of Table IX., Chap. VI.

Compare the ratios of the mean and quartile deviations to the standard
deviation with the ratios stated in §§ 19 and 23 to be usual.

Find the value of the skewness (equation 12), using the approximate value
of the mode.

3. Using, or extending if necessary, your diagram for Question 4, Chap. VII.,
find the quartile values for houses assessed to inhabited house duty in 1885-6,
from the data of Table IV., Chap. VI.

Find also the 9th decile (the value exceeded by 10 per cent. of the houses
only).

4. Verify equation (9) by direct calculation of the standard deviation of the
numbers 1 to 10.

5. (Data from Sauerbeck, Jour. Roy. Stat. Soc., March 1909.) The
following are the index-numbers (percentages) of prices of 45 commodities in
1908 on their average prices in the years 1867-77: 40, 43, 43, 46, 46, 46,
54, 56, 59, 62, 64, 64, 66, 66, 67, 67, 68, 68, 69, 69, 69, 71, 75, 75, 76, 76,
78, 80, 82, 82, 82, 82, 82, 83, 84, 86, 88, 90, 90, 91, 91, 92, 95, 102, 127.
Find the mean and standard deviation (1) without further grouping; (2)
grouping the numbers by fives (40-, 45-, 50-, etc.); (3) grouping by tens (40-,
50-, 60-, etc.).

6. (Continuing from Qu. 8, Chap. VII.) Supposing the frequencies of
values 0, 1, 2, 3, . . . of a variable to be given by the terms of the binomial
series

    $q^n,\ n q^{n-1} p,\ \frac{n(n-1)}{1 \cdot 2} q^{n-2} p^2,\ \ldots$

where p + q = 1, find the standard deviation.

7. (Cf. the remarks at the end of § 17.) The sum of the deviations
(without regard to sign) about the centre of the class-interval containing the
mean (or median), in a grouped frequency-distribution, is found to be S. Find
the correction to be applied to this sum, in order to reduce it to the mean (or
median) as origin, on the assumption that the observations are evenly
distributed over each class-interval. Take the number of observations below the
interval containing the mean (or median) to be $n_1$, in that interval $n_2$, and
above it $n_3$; and the distance of the mean (or median) from the arbitrary
origin to be d.

Show that the values of the mean deviation (from the mean and from the
median respectively) for Example ii., found by the use of this formula, do not
differ from the values found by the simpler method of §§ 16 and 17 in the
second place of decimals.

8. (W. Scheibner, "Ueber Mittelwerthe," Berichte der kgl. sächsischen
Gesellschaft d. Wissenschaften, 1873, p. 564, cited by Fechner, ref. 2 of
Chap. VII.: the second form of the relation is given by G. Duncker (Die
Methode der Variationsstatistik; Leipzig, 1899) as an empirical one.) Show
that if deviations are small compared with the mean, so that $(x/M)^3$ and
higher powers of $x/M$ may be neglected, we have approximately the relation

    $G = M \left( 1 - \frac{\sigma^2}{2M^2} \right),$

where G is the geometric mean, M the arithmetic mean, and σ the standard
deviation: and consequently to the same degree of approximation
$M^2 - G^2 = \sigma^2$.

9. (Scheibner, loc. cit., Qu. 8.) Similarly, show that if deviations are small
compared with the mean, we have approximately

    $H = M \left( 1 - \frac{\sigma^2}{M^2} \right),$

H being the harmonic mean.
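Before attempting the proofs asked for in Questions 8 and 9, the student may care to try the relations numerically. The sketch below takes the usual second-order approximations, G ≈ M(1 - σ²/2M²) and H ≈ M(1 - σ²/M²), and compares them with the exact geometric and harmonic means of an invented series whose deviations are small compared with the mean. The data and variable names are hypothetical, and the standard deviation is assumed to be taken with divisor N.

```python
from statistics import geometric_mean, harmonic_mean, mean, pstdev

# Hypothetical values with deviations small compared with the mean.
values = [98.0, 99.0, 100.0, 101.0, 102.0]

M = mean(values)
sigma = pstdev(values)
G = geometric_mean(values)
H = harmonic_mean(values)

# Second-order approximations, neglecting (x/M)^3 and higher powers.
G_approx = M * (1 - sigma**2 / (2 * M**2))
H_approx = M * (1 - sigma**2 / M**2)

# To the same order, M^2 - G^2 should be nearly equal to sigma^2.
difference_of_squares = M**2 - G**2
```

A numerical agreement of this kind is of course no proof, but it is a useful check on the algebra.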



CHAPTER IX.

CORRELATION.

1-3. The correlation table and its formation. 4-5. The correlation surface.
6-7. The general problem. 8-9. The line of means of rows and the
line of means of columns: their relative positions in the case of
independence and of varying degrees of correlation. 10-14. The
correlation coefficient, the regressions, and the standard-deviations of
arrays. 15-16. Numerical calculations. 17. Certain points to be
remembered in calculating and using the coefficient.

1. In Chapters VI.-VIII. we considered the frequency-distribution
of a single variable, and the more important constants
that may be calculated to describe certain characters of such
distributions. We have now to proceed to the case of two
variables, and the consideration of the relations between them.

2. If the corresponding values of two variables be noted
together, the methods of classification employed in the preceding
chapters may be applied to both, and a table of double entry or
contingency-table (Chap. V.) be formed, exhibiting the frequencies
of pairs of values lying within given class-intervals. Six such
tables are given below as illustrations for the following
variables: Table I., two measurements on a shell (Pecten).
Table II., ages of husbands and wives in England and Wales in
1901. Table III., statures of fathers and their sons (British).
Table IV., fertility of mothers and their daughters (British
peerage). Table V., the rate of discount and the ratio of reserves
to deposits in American banks. Table VI., the proportion of
male to total births, and the total numbers of births, in the
registration districts of England and Wales.

Each row in such a table gives the frequency-distribution of
the first variable for cases in which the second variable lies
within the limits stated on the left of the row. Similarly, every
column gives the frequency-distribution of the second variable
for cases in which the value of the first variable lies within the
limits stated at the head of the column. As "columns" and
"rows" are distinguished only by the accidental circumstance
of the one set running vertically and the other horizontally, and
the difference has no statistical significance, the word array
has been suggested as a convenient term to denote either a row
or a column. If the values of X in one array are associated
with values of Y between the limits $Y_n - \delta$ and $Y_n + \delta$, $Y_n$ may be
termed the type of the array (Pearson, ref. 6). The special
kind of contingency tables with which we are now concerned
are called correlation tables, to distinguish them from tables
based on unmeasured qualities and so forth.

[Table I. Two measurements on a shell (Pecten). (2) Dorso-ventral
diameter, mm.]

[Table II. Ages of husbands and wives in England and Wales in 1901.
(2) Ages of Husbands.]

[Table III. Statures of fathers and their sons (British). (2) Stature of
Son. Total, 1078.]

[Table IV. Fertility of mothers and their daughters (British peerage).
(2) Number of her Daughter's Children.]

[Table V. Correlation between (1) Call Discount Rates and (2) Percentage
of Reserves on Deposits in New York Associated Banks (Weekly Returns).
(From Statistical Studies in the New York Money Market, by J. P. Norton:
Publications of the Department of the Social Sciences, Yale University;
The Macmillan Co., 1902.) Note that, after the column headed 8 per cent.,
blank columns have been omitted to save space.]

[Table VI. (1) Proportion of Male Births per 1000 of all Births.
(2) Total Number of Births in District (000's omitted) during Decade.]

3. Nothing need be added to what was said in Chapter VI. as
regards the choice of magnitude and position of class-intervals.
When these have been fixed, the table is readily compiled by
taking a large sheet ruled with rows and columns properly
headed in the same way as the final table and entering a dot,
stroke, or small cross in the corresponding compartment for each
pair of recorded observations. If facility of checking be of
great importance, each pair of recorded values may be entered
on a separate card and these dealt into little packs on a board
ruled in squares, or into a divided tray; each pack can then be
run through to see that no card has been mis-sorted. The
difficulty as to the intermediate observations (values of the
variables corresponding to divisions between class-intervals) will
be met in the same way as before if the value of one variable
alone be intermediate, the unit of frequency being divided
between two adjacent compartments. If both values of the pair
be intermediates, the observation must be divided between four
adjacent compartments, and thus quarters as well as halves may
occur in the table, as, e.g., in Table III. In this case the statures
of fathers and sons were measured to the nearest quarter-inch
and subsequently grouped by 1-inch intervals: a pair in
which the recorded stature of the father is 60.5 in. and that of
the son 62.5 in. is accordingly entered as 0.25 to each of the
four compartments under the columns 59.5-60.5, 60.5-61.5, and
the rows 61.5-62.5, 62.5-63.5. Workers will generally form
their own methods for entering such fractional frequencies
during the process of compiling, but one convenient method is
to use a small x to denote a unit and a dot for a quarter; the
four dots should be placed in the position of the four points
of the x and joined when complete. It is best to choose the
limits of class-intervals, where possible, in such a way as to avoid
fractional frequencies.
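The division of intermediate observations between adjacent compartments may be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the lowest class limit of 58.5 in. for both variables is an assumption made for the example. Compartments are indexed from 0, so that with this origin index 1 denotes the interval 59.5-60.5, index 2 the interval 60.5-61.5, and so on.

```python
from collections import defaultdict

def class_shares(value, first_limit, width):
    """Return {class_index: share} for a single recorded value.

    Class k runs from first_limit + k*width to first_limit + (k+1)*width.
    A value lying exactly on a division between class-intervals is shared
    equally between the two adjacent classes. (A value equal to first_limit
    itself would be shared with a class of index -1, below the table.)
    """
    position = (value - first_limit) / width
    k = int(position // 1)
    if position == k:
        return {k - 1: 0.5, k: 0.5}
    return {k: 1.0}

def tally(pairs, x_first, y_first, width=1.0):
    """Compile a correlation table as {(row, column): frequency}."""
    table = defaultdict(float)
    for x, y in pairs:
        for col, share_x in class_shares(x, x_first, width).items():
            for row, share_y in class_shares(y, y_first, width).items():
                table[(row, col)] += share_x * share_y
    return dict(table)

# Father 60.5 in., son 62.5 in.: both values are intermediate, so the
# observation is entered as 0.25 in each of four compartments.
quarters = tally([(60.5, 62.5)], x_first=58.5, y_first=58.5)

# Father 60.25 in., son 62.5 in.: only the son's value is intermediate.
halves = tally([(60.25, 62.5)], x_first=58.5, y_first=58.5)
```

Values recorded to the nearest quarter-inch are represented exactly in binary arithmetic, so the test for a value lying on a class limit is safe here; with other units of recording a small tolerance might be needed.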

4. The distribution of frequency for two variables may be
represented by a surface or solid in the same way as the
frequency-distribution of a single variable may be represented by a
plane figure. We may imagine the surface to be obtained by erecting
at the centre of every compartment of the correlation-table a
vertical of length proportionate to the frequency in that
compartment, and joining up the tops of the verticals. If the
compartments were made smaller and smaller while the
class-frequencies remained finite, the irregular figure so obtained would
approximate more and more closely towards a continuous curved
surface, a frequency-surface, corresponding to the frequency-curves
for single variables of Chapter VI. The volume of the
frequency-solid over any area drawn on its base gives the
frequency of pairs of values falling within that area, just as the
area of the frequency-curve over any interval of the base-line gives
the frequency of observations within that interval. Models of
actual distributions may be constructed by drawing the
frequency-distributions for all arrays of the one variable, to the same scale,
on sheets of cardboard, and erecting the cards vertically on a
base-board at equal distances apart, or by marking out a
base-board in squares corresponding to the compartments of the
correlation-table, and erecting on each square a rod of wood of
height proportionate to the frequency. Such solid representations
of frequency-distributions for two variables are sometimes termed
stereograms.

5. It is impossible, however, to group the majority of
frequency-surfaces, in the same way as the frequency-curves,
under a few simple types: the forms are too varied. The simplest
ideal type is one in which every section of the surface is a
symmetrical curve, the first type of Chap. VI. (fig. 5, p. 89). Like
the symmetrical distribution for the single variable, this is a very
rare form of distribution in economic statistics, but approximate
illustrations may be drawn from anthropometry. Fig. 29 shows
the ideal form of the surface, somewhat truncated, and fig. 30
the distribution of Table III., which approximates to the same
type; the difference in steepness is, of course, merely a matter of
scale. The maximum frequency occurs in the centre of the
whole distribution, and the surface is symmetrical round the
vertical through the maximum, equal frequencies occurring at
equal distances from the mode on opposite sides. The next
simplest type of surface corresponds to the second type of
frequency-curve, the moderately asymmetrical. Most, if not all,
of the distributions of arrays are asymmetrical, and like the
distribution of fig. 9, p. 92, the surface is consequently asymmetrical,
and the maximum does not lie in the centre of the distribution.
This form is fairly common, and illustrations might be drawn
from a variety of sources: economics, meteorology, anthropometry,
etc. The data of Table II. will serve as an example. The total
distributions and the distributions of the majority of the arrays
are asymmetrical, the skewness being positive for the rows at
the top of the table (the mode being lower than the mean), and
negative for the rows at the foot, the more central rows being
nearly symmetrical. The maximum frequency lies towards the
upper end of the table in the compartment under the row and
column headed “30-”. The frequency falls off very rapidly
towards the lower ages, and slowly in the direction of old age.
Outside these two forms, it seems impossible to delimit empirically
any simple types. Tables V. and VI. are given simply as
illustrations of two very divergent forms. Fig. 31 gives a graphical
representation of the former by the method corresponding to the
histogram of Chapter VI., the frequency in each compartment
being represented by a square pillar. The distribution of
frequency is very characteristic, and quite different from that
of any of the Tables I., II., III., or IV.

Fig. 29. The ideal symmetrical (“normal”) Frequency-Surface, with the extremes truncated.

6. It is clear that such tables may be treated by any of the
methods discussed in Chapter V., which are applicable to all
contingency-tables, however formed. The distribution may be
investigated in detail by such methods as those of § 4, or tested
for isotropy (§ 11), or the coefficient of contingency can be
calculated (§§ 5-8). In applying any of these methods, however,
it is desirable to use a coarser classification than is suited to the
methods to be presently discussed, and it is not necessary to
retain the constancy of the class-interval. The classification
should, on the contrary, be arranged simply with a view to avoiding
many scattered units or very small frequencies. A few examples
should be worked as exercises by the student (Question 3).

7. But the coefficient of contingency merely tells us whether,
and if so, how closely, the two variables are related, and much
more information than this can be obtained from the
correlation-table, seeing that the measures of Chapters VII. and VIII. can be
applied to the arrays as well as to the total distributions. If the
two variables are independent, the distributions of all parallel
arrays are similar (Chap. V. § 13); hence their averages and
dispersions, e.g. means and standard deviations, must be the same.
In general they are not the same, and the relation between the
mean or standard deviation of the array and its type requires
investigation. Of the two constants, the mean is, in general, the
more important, and our attention will for the present be
confined to it. The majority of the questions of practical statistics
relate solely to averages: the most important and fundamental
question is whether, on an average, high values of the one variable
show any tendency to be associated with high (or with low)
values of the other. If possible, we also desire to know how great a
divergence of the one variable from its average value is associated
with a unit divergence of the other, and to obtain some idea as to
the closeness with which this relation is usually fulfilled.
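
The comparison of arrays described in § 7 may be tried on a small table. The sketch below is an illustration only: both 3 × 3 tables are hypothetical, and `array_constants` is a name introduced here, not one used in the text.

```python
import math


def array_constants(frequencies, x_mids):
    """Return (n, mean, standard deviation) of X for each row of a table."""
    out = []
    for row in frequencies:
        n = sum(row)
        mean = sum(f * x for f, x in zip(row, x_mids)) / n
        var = sum(f * (x - mean) ** 2 for f, x in zip(row, x_mids)) / n
        out.append((n, mean, math.sqrt(var)))
    return out


x_mids = [1.0, 2.0, 3.0]                           # mid-values of the X classes
independent = [[2, 4, 2], [4, 8, 4], [1, 2, 1]]    # rows proportional: similar arrays
associated = [[6, 2, 0], [2, 8, 2], [0, 2, 6]]     # high X goes with the later rows

for name, table in (("independent", independent), ("associated", associated)):
    print(name, [(n, round(m, 3), round(s, 3)) for n, m, s in array_constants(table, x_mids)])
```

In the first table every row has the same mean and standard deviation; in the second the row means shift steadily, which is the behaviour the following sections set out to measure.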

8. Suppose a diagram (fig. 32) to be drawn representing the
values of means of arrays. Let OX, OY be the scales of the two
variables, i.e. the scales at the head and side of the table, 01, 12,
etc., being successive class-intervals. Let $M_1$ be the mean value
of X, and $M_2$ the mean value of Y. If the two variables be
absolutely independent, the distributions of frequency in all
parallel arrays are similar (Chap. V. § 13), and the means of arrays
must lie on the vertical and horizontal lines $M_1M$, $M_2M$, the
small circles denoting means of rows and the small crosses means
of columns. (In any actual case, of course, the means would not
lie so regularly, but, if the independence were almost complete,
would only fluctuate slightly to the one side and the other of the
two lines.)

Fig. 32.

The cases with which the experimentalist, e.g. the chemist or
physicist, has to deal, where the observations are all crowded
closely round a single line, lie at the opposite extreme from
independence. The entries fall into a few compartments only of
each array, and the means of rows and of columns lie approximately
on one and the same curve, like the line MR of fig. 33.

The ordinary cases of statistics are intermediate between these
two extremes, the lines of means being neither at right angles as
in fig. 32, nor coincident as in fig. 33, but standing at an acute
angle with one another as RR (means of rows) and CC (means of
columns) in figs. 36-8. The complete problem of the statistician,
like that of the physicist, is to find formulae or equations which
will suffice to describe approximately these curves.

9. In the general case this may be a difficult problem, but, in
the first place, it often suffices, as already pointed out, to know
merely whether on an average high values of the one variable
show any tendency to be associated with high or with low values
of the other, a purpose which will be served very fairly by fitting a
straight line; and further, in a large number of cases, it is found
either (1) that the means of arrays lie very approximately round
straight lines, or (2) that they lie so irregularly (possibly owing
only to paucity of observations) that the real nature of the curve
is not clearly indicated, and a straight line will do almost as well
as any more elaborate curve. (Cf. figs. 36-38.) In such cases,
and they are relatively more frequent than might be supposed,
the fitting of straight lines to the means of arrays determines
all the most important characters of the distribution. We might
fit such lines by a simple graphical method, plotting the points
representing means of arrays on a diagram like those of figures
36-38, and “fitting” lines to them, say, by means of a stretched
black thread shifted about till it appeared to run as near as
might be to all the points. But such a method is hardly
satisfactory, more especially if the points are somewhat scattered; it
leaves too much room for guesswork, and different observers obtain
very different results. Some method is clearly required which
will enable the observer to determine equations to the two lines
for a given distribution, however irregularly the means may lie,
as simply and definitely as he can calculate the means and
standard deviations.

Fig. 33.

10. Consider the simplest case in which the means of rows lie
exactly on a straight line RR (fig. 34). Let $M_2$ be the mean
value of Y, and let RR cut $M_2x$, the horizontal through $M_2$, in M.
Then it may be shown that the vertical through M must cut OX
in $M_1$, the mean of X. For, let the slope of RR to the vertical,
i.e. the tangent of the angle $M_1MR$ or ratio of kl to lM, be $b_1$,
and let deviations from My, Mx be denoted by x and y. Then for
any one row of type y in which the number of observations is n,
$\Sigma(x) = n\,b_1 y$, and therefore for the whole table, since $\Sigma(ny) = 0$,
$\Sigma(x) = b_1\Sigma(ny) = 0$. $M_1$ must therefore be the mean of X, and
M may accordingly be termed the mean of the whole distribution.
Knowing that RR passes through M, it remains only to determine
$b_1$. This may conveniently be done in terms of the mean product
p of all pairs of associated deviations x and y, i.e.

$$p = \frac{1}{N}\,\Sigma(xy) \qquad (1)$$

For any one row we have

$$\Sigma(xy) = y\,\Sigma(x) = n\,b_1 y^2.$$

Therefore for the whole table

$$\Sigma(xy) = b_1\,\Sigma(ny^2) = N\,b_1\sigma_y^2,$$

$$\text{or}\quad b_1 = \frac{p}{\sigma_y^2} \qquad (2)$$

Similarly, if CC be the line on which lie the means of columns
and $b_2$ its slope to the horizontal, rs/sM,

$$b_2 = \frac{p}{\sigma_x^2} \qquad (3)$$

These two equations (2) and (3) are usually written in a
slightly different form. Let

$$r = \frac{p}{\sigma_x\sigma_y} \qquad (4)$$

Then

$$b_1 = r\,\frac{\sigma_x}{\sigma_y}, \qquad b_2 = r\,\frac{\sigma_y}{\sigma_x} \qquad (5)$$

Or we may write the equations to RR and CC

$$x = r\,\frac{\sigma_x}{\sigma_y}\,y, \qquad y = r\,\frac{\sigma_y}{\sigma_x}\,x \qquad (6)$$

These equations may, of course, be expressed, if desired, in
terms of the absolute values of the variables X and Y instead of
the deviations x and y.
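
The constants of equations (1) to (5) can be computed directly from a list of paired observations. The following Python sketch is illustrative only: the five pairs are hypothetical, and the function and key names are introduced here for convenience.

```python
import math


def correlation_constants(xs, ys):
    """Mean product p, standard deviations, r and the two slopes b1, b2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations x
    dy = [y - mean_y for y in ys]          # deviations y
    p = sum(a * b for a, b in zip(dx, dy)) / n            # equation (1)
    sigma_x = math.sqrt(sum(a * a for a in dx) / n)
    sigma_y = math.sqrt(sum(b * b for b in dy) / n)
    return {
        "p": p,
        "sigma_x": sigma_x,
        "sigma_y": sigma_y,
        "b1": p / sigma_y ** 2,            # equation (2)
        "b2": p / sigma_x ** 2,            # equation (3)
        "r": p / (sigma_x * sigma_y),      # equation (4)
    }


# Hypothetical observations, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
constants = correlation_constants(xs, ys)
print(constants)
```

For these figures p = 1.6 and both variances are 2, so that r, $b_1$ and $b_2$ all come out as 0.8, in agreement with equation (5).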

11. The meaning of the above expressions when the means of
rows and columns do not lie exactly on straight lines is very
readily obtained. If the values of x and $b_1 y$ be noted for all
pairs of associated deviations, we have for the sum of the
squares of the differences, giving $b_1$ its value from (5),

$$\Sigma(x - b_1 y)^2 = N\sigma_x^2(1 - r^2) \qquad (7)$$

If $b_1$ be given any other value, say $(r + \delta)\,\sigma_x/\sigma_y$, then

$$\Sigma(x - b_1 y)^2 = N\sigma_x^2(1 - r^2 + \delta^2).$$

This is necessarily greater than the value (7); hence $\Sigma(x - b_1 y)^2$
has the lowest possible value when $b_1$ is put equal to $r\,\sigma_x/\sigma_y$.
Further, for any one row in which the number of observations
is n, the deviation of the mean of the row from RR is d (fig. 35),
and the standard deviation is $s_x$, $\Sigma(x - b_1 y)^2 = n s_x^2 + n d^2$.
Therefore for the whole table,

$$\Sigma(x - b_1 y)^2 = \Sigma(n s_x^2) + \Sigma(n d^2).$$

But the first of the two sums on the right is unaffected by the
slope or position of RR; hence, the left-hand side being a
minimum, the second sum on the right must be a minimum also.
That is to say, when $b_1$ is put equal to $r\,\sigma_x/\sigma_y$, the sum of the squares
of the distances of the row-means from RR, each multiplied by the
corresponding frequency, is the lowest possible.

Similar theorems hold good, of course, with respect to the line
CC. If $b_2$ be given the value $r\,\sigma_y/\sigma_x$, $\Sigma(y - b_2 x)^2$ is a minimum,
and also $\Sigma(n e^2)$ (fig. 35). Hence we may regard the equations (6)
as being either (a) equations for estimating each individual x
from its associated y (and y from its associated x) in such a way
as to make the sum of the squares of the errors of estimate the
least possible; or (b) equations for estimating the mean of the x's
associated with a given type of y (and the mean of the y's associated
with a given type of x) in such a way as to make the sum of the
squares of the errors of estimate the least possible, when every
mean is counted once for each observation on which it is based.

Fig. 36. Correlation between Age of Husband and Age of Wife in England
and Wales (Table II.): means of rows shown by circles and means of
columns by crosses: r = +0.91.

The lines represented by the two equations are thus, in a certain
natural sense, “lines of best fit” to the two actual lines of means.

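
The minimum property of § 11 may be verified numerically. The sketch below is a check on hypothetical deviations, not part of the original argument: it compares the directly computed sum of squares with $N\sigma_x^2(1 - r^2 + \delta^2)$ for several departures δ from the value r.

```python
def sum_of_squares(dx, dy, slope):
    """Sum of (x - slope * y)^2 over all pairs of deviations."""
    return sum((a - slope * b) ** 2 for a, b in zip(dx, dy))


# Hypothetical deviations from the means (each list sums to zero).
dx = [-2, -1, 0, 1, 2]
dy = [-1, -2, 1, 0, 2]
n = len(dx)

var_x = sum(a * a for a in dx) / n
var_y = sum(b * b for b in dy) / n
p = sum(a * b for a, b in zip(dx, dy)) / n
r = p / (var_x * var_y) ** 0.5
ratio = (var_x / var_y) ** 0.5            # sigma_x / sigma_y

checks = []
for delta in (-0.5, -0.1, 0.0, 0.1, 0.5):
    direct = sum_of_squares(dx, dy, (r + delta) * ratio)
    formula = n * var_x * (1 - r * r + delta * delta)
    checks.append((delta, direct, formula))
    print(delta, round(direct, 6), round(formula, 6))
```

The two columns agree for every δ, and the smallest value occurs at δ = 0, that is, when the slope is $r\,\sigma_x/\sigma_y$.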
12. The constant r is of very great importance. It is
evidently a pure number, and its magnitude is unaffected by the
scales in which x and y are measured, for these scales will
affect the numerator and denominator of (4) to the same
extent. If the two variables are independent, r is zero, for
$b_1$ and $b_2$ are zero (cf. § 8). The sign is the sign of the mean
product p, and accordingly r is positive if large values of x
are associated with large values of y, and conversely (as in
Tables I.-IV.), negative if small values of x are associated with
large values of y and conversely (as in Table V.). The numerical
value cannot exceed ±1, for the sum of the series of squares
in equation (7) is then zero, and the sum of a series of squares
cannot be negative. If r = ±1, it follows that all the observed
pairs of deviations are subject to the relation $x/y = \pm\sigma_x/\sigma_y$: this
would be the case if the circles and crosses in such a diagram as
fig. 33 all lay on one and the same straight line. From these
properties r is termed the coefficient of correlation, and the
expression (4), $r = p/\sigma_x\sigma_y = \Sigma(xy)/N\sigma_x\sigma_y$, should be remembered.

Fig. 37. Correlation between Stature of Father and Stature of Son (Table
III.): means of rows shown by circles and means of columns by crosses:
r = +0.51.

Fig. 38. Correlation between number of a Mother's Children and number of
her Daughter's Children (Table IV.): means of rows shown by circles
and means of columns by crosses: r = +0.21.

It should be noted that, while r is zero if the variables are
independent, the converse is not necessarily true: the fact that
r is zero only implies that the means of rows and columns
lie scattered round two straight lines which do not exhibit
any definite trend, to right or to left, upward or downward.
Two variables for which r is zero are, however, conveniently
spoken of as uncorrelated. Table VI. and fig. 39 will serve as an
illustration of a case in which the variables are almost
uncorrelated but by no means independent, r being very small (-0.014),
but the coefficient of contingency C (for grouping of qu. 3) 0.47.
Figs. 36, 37, 38 are drawn from the data of Tables II., III., and
IV., for which r has the values +0.91, +0.51, and +0.21
respectively, the correlation being positive in each case. The student
should study such tables and diagrams closely, and endeavour to
accustom himself to estimating the value of r from the general
appearance of the table.
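
Two of the properties of r stated in § 12 are easily illustrated: that r may vanish although the variables are far from independent, and that r is unaffected by the scale of measurement. The figures below are hypothetical and chosen only to make the arithmetic plain.

```python
import math


def correlation(xs, ys):
    """Coefficient of correlation r = sum(xy) / (N * sigma_x * sigma_y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# y is completely determined by x, yet the mean product is zero.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
r_parabola = correlation(xs, ys)

# Changing the unit of one variable (here inches to centimetres) leaves r unaltered.
inches = [1, 2, 3, 4, 5]
other = [2, 1, 4, 3, 5]
r_inches = correlation(inches, other)
r_centimetres = correlation([2.54 * v for v in inches], other)

print(r_parabola, r_inches, r_centimetres)
```

The first case is an extreme form of the situation of Table VI.: the variables are uncorrelated, but knowledge of x fixes y exactly.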

13. The two quantities

$$b_1 = r\,\frac{\sigma_x}{\sigma_y}, \qquad b_2 = r\,\frac{\sigma_y}{\sigma_x}$$

are termed the coefficients of regression, or simply the regressions,
$b_1$ being the regression of x on y, or deviation in x corresponding
on the average to a unit change in the type of y, and $b_2$ being
similarly the regression of y on x. Whilst the coefficient of
correlation is always a pure number, the regressions are only
pure numbers if the two variables have the same dimensions, as
in Tables I.-IV.: their magnitudes depend on the ratio of $\sigma_x/\sigma_y$, and
consequently on the units in which x and y are measured. They
are both necessarily of the same sign (the sign of r). Since r is
not greater than unity, one at least of the regressions must be
not greater than unity, but the other may be considerably greater
if the ratio $\sigma_x/\sigma_y$ or $\sigma_y/\sigma_x$ be great. The name regression arose
from the term being first introduced in the case of inheritance of
stature (Galton, refs. 2, 3). In this case the two standard
deviations are very nearly equal, so that both $b_1$ and $b_2$ are less than
unity, say (using the more recent data of Table III.) 0.50 and 0.52.
Hence the sons of fathers of deviation x from the mean of all fathers
have an average deviation of only 0.52x from the mean of all sons;
i.e. they step back or “regress” towards the general mean, and 0.52
may be termed the “ratio of regression.” In general, however,
the idea of a “stepping back” or “regression” towards a more
or less stationary mean is quite inapplicable, obviously so where
the variables are different in kind, as in Tables V. and VI.,
and the term “coefficient of regression” should be regarded simply
as a convenient name for the coefficients $b_1$ and $b_2$. RR and CC
are generally termed the “lines of regression,” and equations (6)
the “regression equations.” The expressions “characteristic lines,”
“characteristic equations” (Yule, ref. 8) would perhaps be better.
Where the actual means of arrays appear to be given, to a
satisfactory degree of approximation, by straight lines, we may say
that the regression is linear. It is not safe, however, to assume
that such linearity extends beyond the limits of observation.

Fig. 39. Correlation between Births in a Registration District and
Proportion of Male Births per thousand of all births (England and Wales,
1881-90, Table VI.): means of rows shown by circles and means
of columns by crosses: r = -0.014.
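
Since $b_1 b_2 = r^2$, the two regressions quoted in § 13 for the stature data can be checked against the coefficient of correlation, and the dependence of the regressions on the units is readily shown. In the sketch below the regressions 0.50 and 0.52 are the figures of the text; the standard deviations used for the change of unit are hypothetical.

```python
import math


def regressions(r, sigma_x, sigma_y):
    """Return (b1, b2): regression of x on y and of y on x."""
    return r * sigma_x / sigma_y, r * sigma_y / sigma_x


# Regressions quoted in the text for Table III.
b1, b2 = 0.50, 0.52
r_from_regressions = math.sqrt(b1 * b2)
print(round(r_from_regressions, 2))

# Hypothetical case: x measured in pence, then in shillings (12 pence).
in_pence = regressions(0.5, 24.0, 2.0)
in_shillings = regressions(0.5, 24.0 / 12, 2.0)
print(in_pence, in_shillings)
```

The product of the two regressions is the same in both units, while each regression separately changes by the factor 12.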

14. The two standard deviations

$$s_x = \sigma_x\sqrt{1 - r^2}, \qquad s_y = \sigma_y\sqrt{1 - r^2}$$

are of considerable importance. It follows from (7) that $s_x$ is the
standard deviation of $(x - b_1 y)$, and similarly $s_y$ is the standard
deviation of $(y - b_2 x)$. Hence we may regard $s_x$ and $s_y$ as the
standard errors (root mean square errors) made in estimating x
from y and y from x by the respective characteristic relations

$$x = b_1 y, \qquad y = b_2 x.$$

$s_x$ may also be regarded as a kind of average standard deviation of
a row about RR, and $s_y$ as an average standard deviation of a
column about CC. In an ideal case, where the regression is
truly linear and the standard deviations of all parallel arrays are
equal, a case to which the distribution of Table III. is a rough
approximation, $s_x$ is the standard deviation of the x-array and $s_y$
the standard deviation of the y-array (cf. Chap. X. § 19 (3)).
Hence $s_x$ and $s_y$ are sometimes termed the “standard deviations
of arrays.”
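
That $s_x$ and $s_y$ are the root mean square errors of estimate may be checked on a small set of deviations. The figures are hypothetical; the sketch only compares the directly computed errors with $\sigma\sqrt{1 - r^2}$.

```python
import math

# Hypothetical deviations from the means.
dx = [-2, -1, 0, 1, 2]
dy = [-1, -2, 1, 0, 2]
n = len(dx)

sigma_x = math.sqrt(sum(a * a for a in dx) / n)
sigma_y = math.sqrt(sum(b * b for b in dy) / n)
r = sum(a * b for a, b in zip(dx, dy)) / (n * sigma_x * sigma_y)
b1 = r * sigma_x / sigma_y
b2 = r * sigma_y / sigma_x

# Root mean square errors made in estimating x from y, and y from x.
rms_x = math.sqrt(sum((a - b1 * b) ** 2 for a, b in zip(dx, dy)) / n)
rms_y = math.sqrt(sum((b - b2 * a) ** 2 for a, b in zip(dx, dy)) / n)

s_x = sigma_x * math.sqrt(1 - r * r)
s_y = sigma_y * math.sqrt(1 - r * r)
print(rms_x, s_x, rms_y, s_y)
```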

15. Proceeding now to the arithmetical work, the only new
expression that has to be calculated in order to determine r, $b_1$, $b_2$,
$s_x$, and $s_y$ is the product sum $\Sigma(xy)$ or the mean product p. As in
the cases of means and standard deviations, the form of the
arithmetic is slightly different according as the observations are
few and ungrouped, or sufficient to justify the formation of a
correlation-table. In the first case, as in Example i. below, the
work is quite straightforward.

Table VII. Theory of Correlation: Example i.

Columns: (1) Union; (2) X, estimated average earnings of agricultural
labourers, shillings and pence per week; (3) Y, percentage of population
in receipt of Poor-law relief; (4) x, deviation of X from mean (pence);
(5) y, deviation of Y from mean; (6) x²; (7) y²; (8) positive products xy;
(9) negative products xy.

 1                      2         3      4       5        6       7        8        9
 Union                  X         Y      x       y        x²      y²     xy (+)   xy (-)
                      s.   d.
 1. Glendale          20    9   2.40    +58   -1.27    3364   1.6129      -     73.66
 2. Wigton            20    3   2.29    +52   -1.38    2704   1.9044      -     71.76
 3. Garstang          19    8   1.39    +45   -2.28    2025   5.1984      -    102.60
 4. Belper            18    6   1.92    +31   -1.75     961   3.0625      -     54.25
 5. Nantwich          17    8   2.98    +21   -0.69     441   0.4761      -     14.49
 6. Atcham            17    6   1.17    +19   -2.50     361   6.2500      -     47.50
 7. Driffield         17    1   3.79    +14   +0.12     196   0.0144    1.68       -
 8. Uttoxeter         17    0   3.01    +13   -0.66     169   0.4356      -      8.58
 9. Wetherby          17    0   2.39    +13   -1.28     169   1.6384      -     16.64
10. Easingwold        16   11   2.78    +12   -0.89     144   0.7921      -     10.68
11. Southwell         16    6   3.09     +7   -0.58      49   0.3364      -      4.06
12. Hollingbourn      16    4   2.78     +5   -0.89      25   0.7921      -      4.45
13. Melton Mowbray    16    3   2.61     +4   -1.06      16   1.1236      -      4.24
14. Truro             16    3   4.33     +4   +0.66      16   0.4356    2.64       -
15. Godstone          16    0   3.02     +1   -0.65       1   0.4225      -      0.65
16. Louth             16    0   4.20     +1   +0.53       1   0.2809    0.53       -
17. Brixworth         15    9   1.29     -2   -2.38       4   5.6644    4.76       -
18. Crediton          15    8   5.16     -3   +1.49       9   2.2201      -      4.47
19. Holbeach          15    6   4.75     -5   +1.08      25   1.1664      -      5.40
20. Maldon            15    6   4.64     -5   +0.97      25   0.9409      -      4.85
21. Monmouth          15    4   4.26     -7   +0.59      49   0.3481      -      4.13
22. St Neots          15    3   1.66     -8   -2.01      64   4.0401   16.08       -
23. Swaffham          15    0   5.37    -11   +1.70     121   2.8900      -     18.70
24. Thakeham          15    0   3.38    -11   -0.29     121   0.0841    3.19       -
25. Thame             15    0   5.84    -11   +2.17     121   4.7089      -     23.87
26. Thingoe           15    0   4.63    -11   +0.96     121   0.9216      -     10.56
27. Basingstoke       15    0   3.93    -11   +0.26     121   0.0676      -      2.86
28. Cirencester       15    0   4.54    -11   +0.87     121   0.7569      -      9.57
29. North Witchford   14   10   3.42    -13   -0.25     169   0.0625    3.25       -
30. Pewsey            14    9   5.88    -14   +2.21     196   4.8841      -     30.94
31. Bromyard          14    9   4.36    -14   +0.69     196   0.4761      -      9.66
32. Wantage           14    9   3.85    -14   +0.18     196   0.0324      -      2.52
33. Stratford on Avon 14    7   3.92    -16   +0.25     256   0.0625      -      4.00
34. Dorchester        14    6   4.48    -17   +0.81     289   0.6561      -     13.77
35. Woburn            14    6   5.67    -17   +2.00     289   4.0000      -     34.00
36. Buntingford       14    4   4.91    -19   +1.24     361   1.5376      -     23.56
37. Pershore          13    6   4.34    -29   +0.67     841   0.4489      -     19.43
38. Langport          12    6   5.19    -41   +1.52    1681   2.3104      -     62.32

Mean of X = 15s. 11d.; mean of Y = 3.67.
Sum of column 6 = 16,018, whence $\sigma_x$ = 20.5d.; sum of column 7 = 63.0556, whence $\sigma_y$ = 1.29 per cent.
Sum of column 8 = 32.13; sum of column 9 = 698.17; $\Sigma(xy)$ = 32.13 - 698.17 = -666.04.

Example i., Table VII. The variables are (1) X, the estimated
average weekly earnings of agricultural labourers in 38 English
Poor-law unions of an agricultural type (the data of Example i.,
Chap. VIII. p. 137); (2) Y, the percentage of the population
in receipt of Poor-law relief on the 1st January 1891 in each of the
same unions (B return). The means of each of the variables are
calculated in the ordinary way, and then the deviations x and y
from the mean are written down (columns 4 and 5); care must
be taken to give each deviation the correct sign. These deviations
are then squared (columns 6 and 7) and the standard deviations
found as before (Chap. VIII. p. 136). Finally, every x is
multiplied by the associated y and the product entered in column
8 or column 9 according to its sign. These columns are then
added up separately and the algebraic sum of the totals gives
$\Sigma(xy)$ = -666.04: therefore the mean product p = $\Sigma(xy)/N$ = -17.53, and

$$r = -\frac{17.53}{20.5 \times 1.29} = -0.66$$

There is therefore a well-marked relation exhibited by these data
between the earnings of agricultural labourers in a district and
the percentage of the population in receipt of Poor-law relief.
A penny is rather a small unit in which to measure deviations in
the average earnings, so for the regressions we may alter the unit
of x to a shilling, making $\sigma_x$ = 1.71, and

$$r\,\frac{\sigma_x}{\sigma_y} = -0.87, \qquad r\,\frac{\sigma_y}{\sigma_x} = -0.50.$$

The regression equations are therefore, in terms of these units,

$$x = -0.87\,y, \qquad y = -0.50\,x.$$

For practical purposes it is more convenient to express the
equations in terms of the absolute values of the variables rather
than the deviations: therefore, replacing x by (X - 15.94) and y
by (Y - 3.67) and simplifying, we have

$$X = 19.13 - 0.87\,Y \qquad (a)$$

$$Y = 11.64 - 0.50\,X \qquad (b)$$

the units being 1s. for the earnings and 1 per cent. for the
pauperism. The standard errors made in using these equations
to estimate earnings from pauperism and pauperism from earnings
respectively are

$$\sigma_x\sqrt{1 - r^2} = 15.4\text{d.} = 1.28\text{s.}$$

$$\sigma_y\sqrt{1 - r^2} = 0.97 \text{ per cent.}$$


The equation (b) tells us therefore that a rise of 2s. in earnings
in passing from one district to another means on the average a
fall of 1 in the percentage in receipt of relief. A natural
conclusion would be that this means a direct effect of the higher
earnings in diminishing the necessity for relief, but such a
conclusion cannot be accepted offhand. Equation (a) indicates,
for instance, that every rise of a unit in the percentage
relieved corresponds to a fall of 0.87 shillings, or 10½d., in earnings:
this might mean that the giving of relief tends to depress wages.
Which is the correct interpretation of the facts? The above
regression equations alone cannot tell us this, and it is in the
discussion of such questions that most of the difficulties of
statistical arguments arise.

Fig. 40. Correlation between Pauperism and Average Earnings of Agricultural
Labourers for certain districts of England (data of Table VII.):
RR, CC, lines of regression; r = -0.66.
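
The arithmetic of Example i. can be retraced from the deviations of columns 4 and 5 of Table VII. The Python sketch below is an illustration added to the text: it uses the deviations as tabulated (x in pence, y in per cent.) and, for the regressions, rounds r and the standard deviations to two places before dividing, which is an assumption made here about the order of the original working; small differences in the last figure can arise if the unrounded values are carried through.

```python
import math

# (x, y): deviations from the means, columns 4 and 5 of Table VII.
deviations = [
    (58, -1.27), (52, -1.38), (45, -2.28), (31, -1.75), (21, -0.69),
    (19, -2.50), (14, 0.12), (13, -0.66), (13, -1.28), (12, -0.89),
    (7, -0.58), (5, -0.89), (4, -1.06), (4, 0.66), (1, -0.65),
    (1, 0.53), (-2, -2.38), (-3, 1.49), (-5, 1.08), (-5, 0.97),
    (-7, 0.59), (-8, -2.01), (-11, 1.70), (-11, -0.29), (-11, 2.17),
    (-11, 0.96), (-11, 0.26), (-11, 0.87), (-13, -0.25), (-14, 2.21),
    (-14, 0.69), (-14, 0.18), (-16, 0.25), (-17, 0.81), (-17, 2.00),
    (-19, 1.24), (-29, 0.67), (-41, 1.52),
]

n = len(deviations)
sum_x2 = sum(x * x for x, _ in deviations)     # column 6
sum_y2 = sum(y * y for _, y in deviations)     # column 7
sum_xy = sum(x * y for x, y in deviations)     # columns 8 and 9 combined

sigma_x = math.sqrt(sum_x2 / n)                # pence
sigma_y = math.sqrt(sum_y2 / n)                # per cent.
p = sum_xy / n                                 # mean product
r = p / (sigma_x * sigma_y)

# Regressions with earnings in shillings (12 pence to the shilling).
r_rounded = round(r, 2)
sigma_x_shillings = round(sigma_x / 12, 2)
sigma_y_rounded = round(sigma_y, 2)
b1 = r_rounded * sigma_x_shillings / sigma_y_rounded   # x on y
b2 = r_rounded * sigma_y_rounded / sigma_x_shillings   # y on x

print(n, sum_x2, round(sum_y2, 4), round(sum_xy, 2))
print(round(p, 2), round(r, 2), round(b1, 2), round(b2, 2))
```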

As a check on the whole of the arithmetical work, and to test
whether the correlation coefficient is unduly affected by a few
outlying observations, or, perhaps, by the regression not being linear,
it is always as well to draw a diagram representing the results
obtained. Take scales along two axes at right angles (fig. 40)
representing the variables, and insert a dot (better, for clearness,
a small circle or a cross) at the point determined by each observed
pair of X and Y. Complete the diagram by inserting the two lines
RR and CC given by the regression equations (a) and (b). In
doing this it is as well to determine a point at each end of both
lines, and then to check the work by seeing that they meet in the
mean of the whole distribution. Thus RR is determined from (a)
by the points Y = 0, X = 19.13 and Y = 6, X = 13.91. CC is
determined from (b) by the points X = 12, Y = 5.64 and X = 21,
Y = 1.14. Marking in these points, and drawing the lines, they
will be found to meet in the mean, X = 15.94, Y = 3.67. The
diagram gives a very clear idea of the distribution; clearly the
regression is as nearly linear as may be with so very scattered a
distribution, and there are no very exceptional observations. The
most exceptional districts are Brixworth and St Neots with rather
low earnings but very low pauperism, and Glendale and Wigton
with the highest earnings but a pauperism well above the lowest,
over 2 per cent.

16. When a classified correlation-table is to be dealt with, the
procedure is of precisely the same kind as was used in the
calculation of a standard deviation, the same artifices being used to shorten
the work. That is to say, (1) the product-sum is calculated in the
first instance with respect to an arbitrary origin, and is afterwards
reduced to the value it would have with respect to the mean; (2)
the arbitrary origin is taken at the centre of a class-interval; (3)
the class-interval is treated as the unit of measurement throughout
the arithmetic.

Let deviations from the arbitrary origin be denoted by $\xi$, $\eta$, and
let $\bar{\xi}$, $\bar{\eta}$ be the co-ordinates of the mean. Then

$$\xi = x + \bar{\xi}, \qquad \eta = y + \bar{\eta},$$

$$\xi\eta = xy + \bar{\xi}\,y + \bar{\eta}\,x + \bar{\xi}\bar{\eta}.$$

Therefore, summing, since the second and third sums on the
right vanish, being the sums of deviations from the mean,

$$\Sigma(\xi\eta) = \Sigma(xy) + N\bar{\xi}\bar{\eta},$$

or bringing $\Sigma(xy)$ to the left,

$$\Sigma(xy) = \Sigma(\xi\eta) - N\bar{\xi}\bar{\eta}.$$

That is, in terms of mean-products, using $p'$ to denote the
mean-product for the arbitrary origin,

$$p = p' - \bar{\xi}\bar{\eta}.$$

In any case where the origin from which deviations have been
measured is not the mean, this correction must be used. It will
sometimes give a sensible correction even for work in the form of
Example i., and in that case, of course, the standard deviations
will also require reduction to the mean.
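
The correction for an arbitrary origin may be verified on a small hypothetical set of observations; the sketch below compares the product-sum reduced by $N\bar{\xi}\bar{\eta}$ with the product-sum taken directly about the means. The data and the choice of origin are assumptions made for illustration.

```python
# Hypothetical observations and an arbitrary origin.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
origin_x, origin_y = 2, 4
n = len(xs)

xi = [x - origin_x for x in xs]            # deviations from the arbitrary origin
eta = [y - origin_y for y in ys]
xi_bar = sum(xi) / n                       # co-ordinates of the mean
eta_bar = sum(eta) / n

sum_xi_eta = sum(a * b for a, b in zip(xi, eta))
sum_xy_corrected = sum_xi_eta - n * xi_bar * eta_bar

# Direct calculation about the means, for comparison.
mean_x = origin_x + xi_bar
mean_y = origin_y + eta_bar
sum_xy_direct = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

print(sum_xi_eta, sum_xy_corrected, sum_xy_direct)
```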
As the arithmetical process of calculating the correlation
coefficient from a grouped table is of great importance, we give two
illustrations, the first economic, the second biological.

Example ii., Table VIII. The two variables are (1) X, the
percentage of males over 65 years of age in receipt of Poor-law
relief in 235 unions of a mainly rural character in England and
Wales; (2) Y, the ratio of the numbers of persons given relief
“outdoors” (in their own homes) to one “indoors” (in the workhouse).
The figures refer to a one-day count (1st August 1890, No. 36,
1890), and the table is one of a series that were drawn up with
the view to discussing the influence of administrative methods on
pauperism. (Economic Journal, vol. vi., 1896, p. 613.)

The arbitrary origin for X was taken at the centre of the fourth
column, or at 17.5 per cent.; for Y at the centre of the fourth
row, or 3.5. The following are the values found for the constants
of the single distributions:

$\bar{\xi}$ = -0.1532 intervals = -0.77 per cent., whence the mean of X = 16.73 per cent.
$\sigma_x$ = 1.29 intervals = 6.45 per cent.
$\bar{\eta}$ = +0.36 intervals or units, whence the mean of Y = 3.86.
$\sigma_y$ = 2.98 units.
To calculate S(^>?), the value of is first written m every 
compartment of the table against the coi responding frequency, 
treating the class-interval as the unit: these aie the figuies in 
heavy type in Table VIII In making these entries the sign of 
the product may be neglected, but it must be remembered that 
this sign will be positive in the upper left-hand and lower right- 
hand quadrants, negative in the two others The frequencies are 
then collected as shown in columns 2 and 3 of Table VIIIa , 
being grouped according to the value and sign of $rj Thus for 
^>7=^1, the total frequency m the positive quadrants is 13-f85 
= 21*5, in the negative 14 q- 6 = 20 for ^77 = 2, 10-f 4*5 + 1 q-4*5 
= 20 in the positive quadrants, 5 + 2 + 1+ 35 = 11 *5 m the 
negative, and so on When columns 2 and 3 are completed, they 
should first of all be checked to see that no fiequency has been 
dropped, which may be readily done by adding together the totals 
of these two columns together with the frequency in row 4 and 
column 4 of Table VIII (the row and column for which ^77 = 0), 
being careful not to count twice the frequency in the compartment 
common to the two ; this grand total must clearly be equal to the 
total number of observations W, or 235 in the present case The 
algebraic sum of the frequencies in each kne of columns 2 and 3 is 



IX — CORHBLATION 


183 


Tablk YIII Theory of Correlation • Example h — Old age Pauperism and 
Proportion of Out-relief. (The Frequencies are the figures printed in ordi- 
nary type The numbers in heavy type are the Deviation-Products (It?) ) 


















184 


THEORY OF STATISTICS. 


Table VIIIa. Calculation of the Product-Sum $\Sigma(\xi\eta)$.

    1.          2.           3.           4.          5.         6.
                   Frequencies                          Products
 $\xi\eta$   Positive     Negative      Total         +          -
             quadrants    quadrants

     1         21.5         20          + 1.5         1.5        ..
     2         20           11.5        + 8.5        17          ..
     3         12            2          +10          30          ..
     4         18            1          +17          68          ..
     5          1            1           ..          ..          ..
     6         17.5          1          +16.5        99          ..
     8          2            0.5        + 1.5        12          ..
     9          1.5          1          + 0.5         4.5        ..
    10          4            0.5        + 3.5        35          ..
    12         ..            2          - 2          ..          24
    15          1           ..          + 1          15          ..
    20         ..            1          - 1          ..          20
    24          1           ..          + 1          24          ..
    28          1           ..          + 1          28          ..

 Totals       100.5         41.5         ..         +334        -44

 Check on the frequencies: 100.5 + 41.5 + 93 = 235.
 Algebraic sum of the products: +334 - 44 = +290.


then entered in column 4, treating the frequencies in column 3 as if
they were themselves negative, and finally the figures of column 4
are multiplied by the values of $\xi\eta$ and the products entered in
column 5 or 6 according to sign. The algebraic sum of the totals
of columns 5 and 6 $= +290 = \Sigma(\xi\eta)$. Whence $p' = 290/235 = 1.234$.

To find the value of $p$ we have, remembering that we are working
with class-intervals as the unit,

$\bar{\xi}\bar{\eta} = -(0.153 \times 0.36) = -0.055$
$p = p' - \bar{\xi}\bar{\eta} = 1.234 + 0.055 = +1.289$
$r = \frac{1.289}{1.29 \times 2.98} = +0.34$.


The regression of pauperism on out-relief ratio is, reverting to
1 per cent as the unit of pauperism instead of the class-interval,
$+0.34 \times 6.45/2.98 = 0.74$, and the regression equation accordingly
$x = 0.74y$, or

$X = 13.9 + 0.74Y$,

the standard error made in using the equation for estimating $X$
from $Y$ being $\sigma_x\sqrt{1 - r^2} = 6.07$.
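The arithmetic of this example can be retraced mechanically. The following Python sketch is offered only as an illustration and is no part of the original working: the variable names are our own, the rows are the entries of Table VIIIa, and 93 is taken as the frequency in the row and column for which the deviation-product is zero. The correlation is carried to two decimals before the regression is formed, as in the text.

```python
import math

# Rows of Table VIIIa: (value of xi*eta, frequency in positive quadrants,
# frequency in negative quadrants), the class-interval being the unit.
ROWS = [
    (1, 21.5, 20), (2, 20, 11.5), (3, 12, 2), (4, 18, 1), (5, 1, 1),
    (6, 17.5, 1), (8, 2, 0.5), (9, 1.5, 1), (10, 4, 0.5), (12, 0, 2),
    (15, 1, 0), (20, 0, 1), (24, 1, 0), (28, 1, 0),
]
N = 235          # total number of observations
ZERO_FREQ = 93   # frequency for which xi*eta = 0

# Check that no frequency has been dropped.
freq_total = sum(pos + neg for _, pos, neg in ROWS) + ZERO_FREQ

# Product-sum: each value times (positive minus negative frequency).
product_sum = sum(v * (pos - neg) for v, pos, neg in ROWS)

xi_bar, eta_bar = -0.1532, 0.36   # means about the arbitrary origin
sigma_x, sigma_y = 1.29, 2.98     # standard deviations, in class-intervals

p_prime = product_sum / N         # mean product about the arbitrary origin
p = p_prime - xi_bar * eta_bar    # mean product about the true means
r = p / (sigma_x * sigma_y)

# Regression of X on Y with 1 per cent as the unit of pauperism.
r2 = round(r, 2)
b = r2 * 6.45 / 2.98
intercept = 16.73 - b * 3.86
std_error = 6.45 * math.sqrt(1 - r2 ** 2)

print(freq_total, product_sum, round(p_prime, 3), round(p, 3), r2)
print(round(b, 2), round(intercept, 1), round(std_error, 2))
```

The first printed line repeats the check on the frequencies and the values of the product-sum, $p'$, $p$ and $r$; the second gives the regression coefficient, the constant term and the standard error of estimate.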

This is the equation of greatest practical interest, telling us
that, as we pass from one district to another, a rise of 1 in the
ratio of the numbers relieved in their own homes to the numbers
relieved in the workhouse corresponds on an average to a rise of
0.74 in the percentage in receipt of relief. The result is such as
to create a presumption in favour of the view that the giving of
out-relief tends to increase the numbers relieved, and this can be
taken as a working hypothesis for further investigation.

The student should work out the second regression equation,
and check both by calculating the means of the principal rows
and columns, and drawing a diagram like figs. 36, 37, and 38.

Example iii., Table IX. (Unpublished data; measurements by
G. U. Yule.) The two variables are (1) $X$, the length of a mother-frond
of duckweed (Lemna minor); (2) $Y$, the length of the
daughter-frond. The mother-frond was measured when the
daughter-frond separated from it, and the daughter-frond when
its first daughter-frond separated. Measures were taken from
camera drawings made with the Zeiss-Abbe camera under a low
power, the actual magnification being 24 : 1. The units of length
in the tabulated measurements are millimetres on the drawings.

The arbitrary origin for both $X$ and $Y$ was taken at 105 mm.
The following are the values found for the constants of the single
distributions:

$\bar{\xi} = -1.058$ intervals $= -6.3$ mm. $M_1 = 98.7$ mm on drawing $= 4.11$ mm actual.
$\sigma_x = 2.828$ intervals $= 17.0$ mm on drawing $= 0.707$ mm actual.
$\bar{\eta} = -0.203$ intervals $= -1.2$ mm. $M_2 = 103.8$ mm on drawing $= 4.32$ mm actual.
$\sigma_y = 3.084$ intervals $= 18.5$ mm on drawing $= 0.771$ mm actual.

The values of $\xi\eta$ are entered in every compartment of the
table as before, and the frequencies then collected, according to
the magnitude and sign of $\xi\eta$, in columns 2 and 3 of Table IXa.
The entries in these two columns are next checked by adding to
the totals the frequency in the row and column for which $\xi\eta$ is
zero, and seeing that it gives the total number of observations
(266). The numbers in column 4 are given by deducting the
entries in column 3 from those in column 2. The totals so
obtained are multiplied by $\xi\eta$ (column 1) and the products entered





Table IXa.

    1.          2.           3.           4.          5.         6.
                   Frequencies                          Products
 $\xi\eta$   Positive     Negative      Total         +          -
             quadrants    quadrants

     1         ..            8.5        - 8.5        ..           8.5
     2         17           13.5        + 3.5         7          ..
     3         10.5          9          + 1.5         4.5        ..
     4         13.5          6.5        + 7          28          ..
     5          2            0.5        + 1.5         7.5        ..
     6         13.5          5          + 8.5        51          ..
     8         13            1          +12          96          ..
     9          9            4          + 5          45          ..
    10          6.5          1          + 5.5        55          ..
    12         17.5         ..          +17.5       210          ..
    14          1           ..          + 1          14          ..
    15          6           ..          + 6          90          ..
    16          7           ..          + 7         112          ..
    18          2           ..          + 2          36          ..
    20          8           ..          + 8         160          ..
    21          2           ..          + 2          42          ..
    24          6           ..          + 6         144          ..
    25          1           ..          + 1          25          ..
    28          1           ..          + 1          28          ..
    30          3           ..          + 3          90          ..
    36          1           ..          + 1          36          ..
    40          1           ..          + 1          40          ..
    42          2           ..          + 2          84          ..
    60          1           ..          + 1          60          ..
    63          1           ..          + 1          63          ..

 Totals       145.5         49           ..        +1528         -8.5

 Check on the frequencies: 145.5 + 49 + 71.5 = 266.
 Algebraic sum of the products: +1528 - 8.5 = +1519.5.

in column 5 or 6 according to sign. The algebraic sum of the
totals of these two columns gives $\Sigma(\xi\eta) = +1519.5$. Dividing
by 266, $p' = 5.712$. But $\bar{\xi}\bar{\eta} = +1.058 \times 0.203 = +0.215$;
therefore $p = 5.712 - 0.215 = 5.497$, and

$r = +\frac{5.497}{2.828 \times 3.084} = +0.63$.








The regression of daughter-frond on mother-frond is 0.69 (a
value which will not be altered by altering the units of measurement
for both mother- and daughter-fronds, as such an alteration
will affect both standard deviations equally). Hence the regression
equation giving the average actual length (in millimetres)
of daughter-fronds for mother-fronds of actual length $X$ is

$Y = 1.48 + 0.69X$.

We again leave it to the student to work out the second
regression equation giving the average length of mother-fronds
for daughter-fronds of length $Y$, and to check the whole work
by a diagram showing the lines of regression and the means of
arrays for the central portion of the table.
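The remark that the regression is unaltered by a common change of units may be verified numerically. The sketch below is illustrative only (the names are ours): it forms $r\sigma_y/\sigma_x$ from the standard deviations expressed in class-intervals, in millimetres on the drawing, and in actual millimetres, and then builds the regression equation from the actual means.

```python
r = 0.63

# Standard deviations of X (mother-frond) and Y (daughter-frond),
# expressed in three different units of length.
units = {
    "class-intervals": (2.828, 3.084),
    "mm on drawing": (17.0, 18.5),
    "mm actual": (0.707, 0.771),
}

# The regression of Y on X is r * sigma_y / sigma_x in every system of units.
coefficients = {name: r * sy / sx for name, (sx, sy) in units.items()}

# Regression equation in actual millimetres: Y = a + b X.
b = round(coefficients["mm actual"], 2)
a = 4.32 - b * 4.11   # mean of Y less b times the mean of X

for name, value in coefficients.items():
    print(f"{name}: {value:.2f}")
print(f"Y = {a:.2f} + {b:.2f} X")
```

The three coefficients differ only in the third decimal place, the differences arising from the rounding of the standard deviations.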

17. The student should be careful to remember the following
points in working:

(1) To give $p'$ and $\bar{\xi}\bar{\eta}$ their correct signs in finding the true
mean deviation-product $p$.

(2) To express $\sigma_x$ and $\sigma_y$ in terms of the class-interval as a
unit, in the value of $r = p/\sigma_x\sigma_y$, for these are the units in terms
of which $p$ has been calculated.

(3) To use the proper units for the standard deviations (not
class-intervals in general) in calculating the coefficients of
regression: in forming the regression equation in terms of the
absolute values of the variables, for example, as above, the work
will be wrong unless means and standard deviations are expressed
in the same units.

Further, it must always be remembered that correlation
coefficients, like all other statistical measures, are subject to
fluctuations of sampling (cf. Chap. III. §§ 7, 8). If we write
on cards a series of pairs of strictly independent values of $x$ and
$y$ and then work out the correlation coefficient for samples of,
say, 40 or 50 cards taken at random, we are very unlikely ever
to find $r = 0$ absolutely, but will find a series of positive and
negative values centring round 0. No great stress can therefore
be laid on small, or even on moderately large, values of $r$ as
indicating a true correlation if the numbers of observations be
small. For instance, if $N = 36$, a value of $r = \pm 0.5$ may be
merely a chance result (though a very infrequent one); if
$N = 100$, $r = \pm 0.3$ may similarly be a mere fluctuation of
sampling, though again an infrequent one. If $N = 900$, a value
of $r = \pm 0.1$ might occur as a fluctuation of sampling of the same
degree of infrequency. The student must therefore be careful in
interpreting his coefficients. (See Chap. XVII. § 15.)
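The experiment with cards is easily imitated by machine. The following Python sketch is an illustration under stated assumptions, not a reproduction of any experiment in the text: independent normal deviates stand in for the values written on the cards, the sample size is fixed at 40, and the seed and the number of samples (2000) are arbitrary choices of ours. The spread of the resulting values of $r$ is printed beside $1/\sqrt{N}$, the usual large-sample approximation to the standard error of $r$ for independent variables; it may be noticed that each of the three cases quoted above corresponds to $r = 3/\sqrt{N}$.

```python
import math
import random


def correlation(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_r(n, rng):
    """Value of r for one sample of n strictly independent pairs (x, y)."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [rng.gauss(0, 1) for _ in range(n)]
    return correlation(xs, ys)


rng = random.Random(1932)   # fixed seed, so that the run can be repeated
values = [sample_r(40, rng) for _ in range(2000)]

mean_r = sum(values) / len(values)
spread = math.sqrt(sum((v - mean_r) ** 2 for v in values) / len(values))

print(f"mean r = {mean_r:+.3f}, standard deviation of r = {spread:.3f}")
print(f"1/sqrt(40) = {1 / math.sqrt(40):.3f}")
```

No sample gives $r = 0$ exactly; the values scatter on both sides of zero, as the text describes.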

Finally, it should be borne in mind that any coefficient, e.g. the
coefficient of correlation or the coefficient of contingency, gives
only a part of the information afforded by the original data or
the correlation table. The correlation table itself, or the original
data if no correlation table has been compiled, should always be
given, unless considerations of space or of expense absolutely
preclude the adoption of such a course.


REFERENCES.

The theory of correlation was first developed on definite assumptions
as to the form of the distribution of frequency, the so-called "normal
distribution" (Chap. XVI.) being assumed. In (1) Bravais introduced
the product-sum, but not a single symbol for a coefficient of correlation.
Sir Francis Galton, in (2), (3), and (4), developed the practical method,
determining his coefficient (Galton's function, as it was termed at first)
graphically. Edgeworth developed the theoretical side further in (5),
and Pearson introduced the product-sum formula in (6), both memoirs
being written on the assumption of a "normal" distribution of frequency
(cf. Chap. XVI.). The method used in the preceding chapter
is based on (7) and (8).

(1) Bravais, A., "Analyse mathématique sur les probabilités des erreurs de
situation d'un point," Acad. des Sciences: Mémoires présentés par divers
savants, II^e série, t. ix., 1846, p. 256.

(2) Galton, Francis, "Regression towards Mediocrity in Hereditary
Stature," Jour. Anthrop. Inst., vol. xv., 1886, p. 246.

(3) Galton, Francis, "Family Likeness in Stature," Proc. Roy. Soc.,
vol. xl., 1886, p. 42.

(4) Galton, Francis, "Correlations and their Measurement," Proc. Roy.
Soc., vol. xlv., 1888, p. 135.

(5) Edgeworth, F. Y., "On Correlated Averages," Phil. Mag., 5th Series,
vol. xxxiv., 1892, p. 190.

(6) Pearson, Karl, "Regression, Heredity, and Panmixia," Phil. Trans.
Roy. Soc., Series A, vol. clxxxvii., 1896, p. 253.

(7) Yule, G. U., "On the Significance of Bravais' Formulae for Regression,
etc., in the case of Skew Correlation," Proc. Roy. Soc., vol. lx., 1897,
p. 477.

(8) Yule, G. U., "On the Theory of Correlation," Jour. Roy. Stat. Soc.,
vol. lx., 1897, p. 812.

(9) Darbishire, A. D., "Some Tables for illustrating Statistical Correlation,"
Mem. and Proc. of the Manchester Lit. and Phil. Soc., vol. li.,
1907. (Tables and diagrams illustrating the meaning of values of the
correlation coefficient from 0 to 1 by steps of a twelfth.)

Reference may also be made here to:

(10) Edgeworth, F. Y., "On a New Method of reducing Observations
relating to several Quantities," Phil. Mag., 5th Series, vol. xxiv., 1887,
p. 222, and vol. xxv., 1888, p. 184. (A method of treating correlated
variables differing entirely from that described in the preceding
chapter, and based on the use of the median: the method involves
the use of trial and error to some extent. For some illustrations see
F. Y. Edgeworth and A. L. Bowley, Jour. Roy. Stat. Soc., vol. lxv.,
1902, p. 341 et seq.)

References to memoirs on the theory of non-linear regression are given
at the end of Chapter X.





EXERCISES.

1. Find the correlation-coefficient and the equations of regression for the
following values of X and Y:

    X    Y
    1    2
    2    5
    3    3
    4    8
    5    7

[As a matter of practice it is never worth calculating a correlation-coefficient
for so few observations: the figures are given solely as a short example on
which the student can test his knowledge of the work.]

2. The following figures show, for the districts of Example i., the ratios of
the numbers of paupers in receipt of outdoor relief to the numbers in receipt
of relief in the workhouse. Find the correlations between the out-relief ratio
and (1) the estimated earnings of agricultural labourers, (2) the percentage
of the population in receipt of relief.


     1     6.40        14     7.50        27     2.97
     2     4.04        15     4.44        28     5.38
     3     7.90        16     8.34        29     3.24
     4     3.31        17     0.69        30     7.61
     5     7.85        18     9.89        31     5.87
     6     0.45        19     4.00        32     5.50
     7    10.00        20     6.02        33     3.58
     8     4.43        21     8.27        34     6.93
     9     4.78        22     1.58        35     6.02
    10     4.73        23    16.04        36     4.92
    11     6.66        24     1.96        37     4.64
    12     1.22        25     9.28        38    10.56
    13     4.27        26     8.72




3. Verify the following data for the under-mentioned tables of the preceding
chapter. Calculate the means of rows and columns and draw diagrams showing
the lines of regression, as figs. 36-39, for one or two cases at least.

                                  I.          II.           III.        IV.      VI.

 Mean of X                     56.3 mm     40.6 years    67.70 ins.    5.90     509.2
 Mean of Y                     53.1 mm     42.8 years    68.66 ins.    4.33    14,500
 Standard deviation of X       6.86 mm     12.7 years     2.72 ins.    2.83      7.46
 Standard deviation of Y       6.77 mm     13.1 years     2.75 ins.    2.97    18,100
 Coefficient of correlation    +0.97       +0.91         +0.51        +0.21    -0.014
 Coefficient of contingency
   (for the grouping stated
   below)                       0.90        0.81          0.51         0.31     0.47




In calculating the coefficient of contingency (coefficient of mean square
contingency) use the following groupings, so as to avoid small scattered
frequencies at the extremities of the tables and also excessive arithmetic:

I. Group together (1) two top rows, (2) three bottom rows, (3) two first
columns, (4) four last columns, leaving centre of table as it stands.

II. Regroup by ten-year intervals (15-, 25-, 35-, etc.) for both husband and
wife, making the last group "65 and over."

III. Regroup by 2-inch intervals, 58.5-60.5, etc., for father, 59.5-61.5,
etc., for son. If a 3-inch grouping be used (58.5-61.5, etc., for both father and
son), the coefficient of mean square contingency is 0.465. [Both results cited
from Pearson, ref. 1 of Chap. V.]

IV. For cols., group 1 + 2, 3 + 4, . . . , 11 + 12, 13 and upwards. Rows,
0, 1 + 2, 3 + 4, . . . , 9 + 10, 11 and upwards.

VI. For cols., group all up to 494.5 and all over 521.5, leaving central cols.
Rows singly up to 20, then 20-28, 28-44, 44-56, 56 upwards.



CHAPTER X.


CORRELATION: ILLUSTRATIONS AND PRACTICAL
METHODS.

1. Necessity for careful choice of variables before proceeding to calculate r.
2-8. Illustration i.: Causation of pauperism. 9-10. Illustration
ii.: Inheritance of fertility. 11-13. Illustration iii.: The weather
and the crops. 14. Correlation between the movements of two
variables: (a) Non-periodic movements: Illustration iv.: Changes
in infantile and general mortality. 15-17. (b) Quasi-periodic
movements: Illustration v.: The marriage-rate and foreign trade.
18. Elementary methods of dealing with cases of non-linear regression.
19. Certain rough methods of approximating to the correlation
coefficient. 20-22. The correlation ratio.

1. The student, especially the student of economic statistics, to
whom this chapter is principally addressed, should be careful to
note that the coefficient of correlation, like an average or a
measure of dispersion, only exhibits in a summary and comprehensible
form one particular aspect of the facts on which it is
based, and the real difficulties arise in the interpretation of the
coefficient when obtained. The value of the coefficient may be
consistent with some given hypothesis, but it may be equally
consistent with others, and not only are care and judgment
essential for the discussion of such possible hypotheses, but also
a thorough knowledge of the facts in all other possible aspects.
Further, care should be exercised from the commencement in the
selection of the variables between which the correlation shall be
determined. The variables should be defined in such a way as
to render the correlations as readily interpretable as possible,
and, if several are to be dealt with, they should afford the answers
to specific and definite questions. Unfortunately, the field of
choice is frequently very much limited, by deficiencies in the
available data and so forth, and consequently practical possibilities
as well as ideal requirements have to be taken into account. No
general rules can be laid down, but the following are given as
illustrations of the sort of points that have to be considered.



2. Illustration i. It is required to throw some light on the
variations of pauperism in the unions (unions of parishes) of
England. (Cf. Yule, ref. 2.)

One table (Table VIII.) bearing on a part of this question, viz.
the influence of the giving of out-relief on the proportion of the
aged in receipt of relief, was given in Chap. IX. (p. 183). The
question was treated by correlating the percentage of the aged
relieved in different districts with the ratio of numbers relieved
outdoors to the numbers in the workhouse. Is such a method
the best possible?

On the whole, it would seem better to correlate changes in
pauperism with changes in various possible factors. If we say
that a high rate of pauperism in some district is due to lax
administration, we presumably mean that as administration
became lax, pauperism rose, or that if administration were more
strict, pauperism would decrease; if we say that the high pauperism
is due to the depressed condition of industry, we mean that
when industry recovers, pauperism will fall. When we say, in
fact, that any one variable is a factor of pauperism, we mean
that changes in that variable are accompanied by changes in the
percentage of the population in receipt of relief, either in the
same or the reverse direction. It will be better, therefore, to
deal with changes in pauperism and possible factors. The next
question is what factors to choose.

3. The possible factors may be grouped under three heads:

(a) Administration. Changes in the method or strictness of
administration of the law.

(b) Environment. Changes in economic conditions (wages,
prices, employment), social conditions (residential or industrial
character of the district, density of population, nationality of
population), or moral conditions (as illustrated, e.g., by the statistics
of crime).

(c) Age Distribution. The percentage of the population between
given age-limits in receipt of relief increases very rapidly with old
age, the actual figures given by one of the only two then existing
returns of the age of paupers being: 2 per cent under age 16,
1 per cent over 16 but under 65, 20 per cent over 65. (Return
36, 1890.)

It is practically impossible to deal with more than three factors,
one from each of the above groups, or four variables altogether,
including the pauperism itself. What shall we take, then,
as representative variables, and how shall we best measure
pauperism?

4. Pauperism. The returns give (a) cost, (b) numbers relieved.
It seems better to deal with (b) (as in the illustration of Table
VIII., Chap. IX.), as numbers are more important than cost from
the standpoint of the moral effect of relief on the population.
The returns, however, generally include both lunatics and vagrants
in the totals of persons relieved; and as the administrative methods
of dealing with these two classes differ entirely from the methods
applicable to ordinary pauperism, it seems better to alter the
official total by excluding them. Returns are available giving
the numbers in receipt of relief on 1st January and 1st July;
there does not seem to be any special reason for taking the one
return rather than the other, but the return for 1st January was
actually used. The percentage of the population in receipt of
relief on 1st January 1871, 1881, and 1891 (the three census
years), less lunatics and vagrants, was therefore tabulated for each
union. (The investigation was carried out in 1898.)

5. Administration. The most important point here, and one
that lends itself readily to statistical treatment, is the relative
proportion of indoor and outdoor relief (relief in the workhouse
and relief in the applicant's home). The first question is,
again, shall we measure this proportion by cost or by numbers?
The latter seems, as before, the simpler and more important ratio
for the present purpose, though some writers have preferred the
statement in terms of expenditure (e.g. Mr Charles Booth, Aged
Poor: Condition, 1894). If we decide on the statement in terms
of numbers, we still have the choice of expressing the proportion (1)
as the ratio of numbers given out-relief to numbers in the workhouse,
or (2) as the percentage of numbers given out-relief on
the total number relieved. The former method was chosen,
partly on the simple ground that it had already been used in an
earlier investigation, partly on the ground that the use of the
ratio separates the higher proportions of out-relief more clearly
from each other, and these differences seem to have significance.
Thus a union with a ratio of 15 outdoor paupers to one indoor
seems to be materially different from one with a ratio of, say, 10
to 1; but if we take, instead of the ratios, the percentages of
outdoor to total paupers, the figures are 94 per cent and 91 per
cent respectively, which are so close that they will probably fall
into the same array. The ratio of numbers in receipt of outdoor
relief to the numbers in the workhouse, in every union, was
therefore tabulated for 1st January in the census years 1871, 1881,
1891.
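The compression of the higher proportions when percentages are used in place of ratios may be seen at a glance from a short computation. The sketch below is illustrative only, and the function name is ours: it converts a ratio of outdoor to indoor paupers into the percentage of outdoor paupers on the total relieved.

```python
def outdoor_percentage(ratio):
    """Percentage of outdoor paupers on the total, given the ratio
    of outdoor paupers to one indoor pauper."""
    return 100 * ratio / (ratio + 1)


for ratio in (1, 2, 5, 10, 15):
    print(f"{ratio:>2} to 1 -> {outdoor_percentage(ratio):.0f} per cent")
```

A change of the ratio from 1 to 2 moves the percentage by some seventeen points, while a change from 10 to 15 moves it by only three.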

6. Environment. This is the most difficult factor of all to deal
with. In Mr Booth's work the factors tabulated were (1) persons
per acre; (2) percentage of population living two or more to a
room, i.e. "overcrowding"; (3) rateable value per head (Aged Poor:
Condition). The data relating to overcrowding were first collected
at the census of 1891, and are not available for earlier years.
Some trial was made of rateable value per head, but with not
very satisfactory results. For any given year, and for a group of
unions of somewhat similar character, e.g. rural, the rateable value
per head appears to be highly (negatively) correlated with the
pauperism, but changes in the two are not very highly correlated:
probably the movements of assessments are sluggish and irregular,
especially in the case of falling assessments in rural unions, and
do not correspond at all accurately with the real changes in the
value of agricultural land. After some consideration, it was
decided to use a very simple index to the changing fortunes of a
district, viz. the movement of the population itself. If the
population of a district is increasing at a rate above the average,
this is prima facie evidence that its industries are prospering; if
the population is decreasing, or not increasing as fast as the
average, this strongly suggests that the industries are suffering
from a temporary lack of prosperity or permanent decay. The
population of every union was therefore tabulated for the censuses
of 1871, 1881, 1891.

7. Age Distribution. As already stated, the figures that are
known clearly indicate a very rapid rise of the percentage relieved
after 65 years of age. The percentage of the population over 65
years of age was therefore worked out for every union and tabulated
from the same three censuses. This is not, of course,
at all a complete index to the composition of the population as
affecting the rate of pauperism, which is sensibly dependent on
the proportion of the two sexes, and the numbers of children as
well. As the percentage in receipt of relief was, however, 20 per
cent for those over 65, and only 1-2 per cent for those under that
age, it is evidently a most important index. (A more complete
method might have been used by correcting the observed rate of
pauperism to the basis of a standard population with given numbers
of each age and sex. Cf. below, Chap. XI. pp. 228-25.)

8. The changes in each of the four quantities that had been
tabulated for every union were then measured by working out the
ratios for the intercensal decades 1871-81 and 1881-91, taking
the value in the earlier year as 100 in each case. The percentage
ratios so obtained were taken as the four variables. Further, as
the conditions are and were very different for rural and for urban
unions, it seemed very desirable to separate the unions into groups
according to their character. But this cannot be done with any
exactness: the majority of unions are of a mixed character, consisting,
say, of a small town with a considerable extent of the
surrounding country. It might seem best to base the classification
on returns of occupations, e.g. the proportions of the population
engaged in agriculture, but the statistics of occupations are not
given in the census for individual unions. Finally, it was decided
to use a classification by density of population, the grouping used
being: Rural, 0.3 person per acre or less. Mixed, more than
0.3 but not more than 1 person per acre. Urban, more than 1 person
per acre. The metropolitan unions were also treated by themselves.
The limit 0.3 for rural unions was suggested by the
density of those agricultural unions the conditions in which
were investigated by the Labour Commission (the unions of
Table VII., Chap. IX.): the average density of these was 0.25,
and 34 of the 38 were under 0.3. The lower limit of density for
urban unions, 1 per acre, was suggested by a grouping of Mr
Booth's (group xiv.): of course 1 person per acre is not a density
associated with an urban district in the ordinary sense of the
term, but a country district cannot reach this density unless it
include a small town or portion of a town, i.e. unless a large
proportion of its inhabitants live under urban conditions.

The method by which the relations between four variables are
discussed is fully described in Chapter XII.: at the present stage
it can only be stated that the discussion is based on the correlations
between all the possible (6) pairs that can be formed from the four
variables.
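The three operations of this section (forming the percentage ratios, grouping the unions by density, and listing the pairs of variables) may be sketched as follows. This is an illustration only: the function names, the short labels for the four variables, and the figures passed to percentage_ratio are hypothetical, and the separate treatment of the metropolitan unions is not represented.

```python
from itertools import combinations


def percentage_ratio(earlier, later):
    """Value at the later census, the earlier value being taken as 100."""
    return 100 * later / earlier


def classify_union(persons_per_acre):
    """Grouping of a union by density of population."""
    if persons_per_acre <= 0.3:
        return "rural"
    if persons_per_acre <= 1:
        return "mixed"
    return "urban"


# Short labels (ours) for the four variables of the investigation.
VARIABLES = ("pauperism", "out-relief ratio", "population", "proportion of old")
pairs = list(combinations(VARIABLES, 2))

print(percentage_ratio(4.0, 3.0))   # hypothetical figures, not from the returns
print(classify_union(0.25), classify_union(0.8), classify_union(2.5))
print(len(pairs), "pairs of variables")
```

Four variables yield six pairs, and hence six correlation coefficients, for each group of unions and each decade.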

9. Illustration ii. The subject of investigation is the inheritance
of fertility in man. (Cf. Pearson and others, ref. 3.) One table,
from the memoir cited, was given as an example in the last chapter
(Table IV.).

Fertility in man (i.e. the number of children born to a given pair)
is very largely influenced by the age of husband and wife at
marriage (especially the latter), and by the duration of marriage.
It is desired to find whether it is also influenced by the heritable
constitution of the parents, i.e. whether, allowance being made for
the effect of such disturbing causes as age and duration of marriage,
fertility is itself a heritable character.

The effect of duration of marriage may be largely eliminated
by excluding all marriages which have not lasted, say, 15 years
at least. This will rather heavily reduce the number of records
available, but will leave a sufficient number for discussion. It
would be desirable to eliminate the effect of late marriages in
the same way by excluding all cases in which, say, husband was
over 30 years of age or wife over 25 (or even less) at the time
of marriage. But, unfortunately, this is impossible; the age of
the wife, the most important factor, is only exceptionally given
in peerages, family histories, and similar works, from which the
data must be compiled. All marriages must therefore be
included, whatever the age of the parents at marriage, and the
effect of the varying age at marriage must be estimated
afterwards.

10. But the correlation between (1) number of children of a
woman and (2) number of children of her daughter will be further
affected according as we include in the record all her available
daughters or only one. Suppose, e.g., the number of children in
the first generation is 5 (say the mother and her brothers and
sisters), and that she has three daughters with 0, 2, and 4
children respectively: are we to enter all three pairs (5, 0),
(5, 2), (5, 4) in the correlation-table, or only one pair? If the
latter, which pair? For theoretical simplicity the second process
is distinctly the best (though it still further limits the available
data). If it be adopted, some regular rule will have to be made
for the selection of the daughter whose fertility shall be entered
in the table, so as to avoid bias: the first daughter married
for whom data are given, and who fulfils the conditions as to
duration of marriage, may, for instance, be taken in every case.
(For a much more detailed discussion of the problem, and the
allied problems regarding the inheritance of fertility in the horse,
the student is referred to the original.)

11. Illustration iii. The subject for investigation is the
relation between the bulk of a crop (wheat and other cereals,
turnips and other root crops, hay, etc.), and the weather. (Cf.
Hooker, ref. 7.)

Produce-statistics for the more important crops of Great
Britain have been issued by the Board of Agriculture since
1885: the figures are based on estimates of the yield furnished
by official local estimators all over the country. Estimates are
published for separate counties and for groups of counties
(divisions). But the climatic conditions vary so much over the
United Kingdom that it is better to deal with a smaller area,
more homogeneous from the meteorological standpoint. On the
other hand, the area should not be too small; it should be large
enough to present a representative variety of soil. The group
of eastern counties, consisting of Lincoln, Hunts, Cambridge,
Norfolk, Suffolk, Essex, Bedford, and Hertford, was selected as
fulfilling these conditions. The group includes the county with
the largest acreage of each of the ten crops investigated, with
the single exception of permanent grass.

12. The produce of a crop is dependent on the weather of
a long preceding period, and it is naturally desired to find the
influence of the weather at all successive stages during this
period, and to determine, for each crop, which period of the
year is of most critical importance as regards weather. It must
be remembered, however, that the times of both sowing and
harvest are themselves very largely dependent on the weather,
and consequently, on an average of many years, the limits of
the critical period will not be very well defined. If, therefore,
we correlate the produce of the crop (X) with the characteristics
of the weather (Y) during successive intervals of the year, it
will be as well not to make these intervals too short. It was
accordingly decided to take successive groups of 8 weeks,
overlapping each other by 4 weeks, i.e. weeks 1-8, 5-12, etc.
Correlation coefficients were thus obtained at 4-weeks intervals,
but based on 8 weeks' weather.
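The scheme of overlapping periods can be written down in a few lines. The sketch below is illustrative: the function name is ours, and the run of 52 weeks is assumed merely for the example (the period actually investigated is not stated in this section).

```python
def overlapping_periods(last_week, length=8, step=4):
    """Successive (first week, final week) groups of `length` weeks,
    each beginning `step` weeks after the one before."""
    return [(start, start + length - 1)
            for start in range(1, last_week - length + 2, step)]


periods = overlapping_periods(52)
print(periods[:3])
print(len(periods), "periods")
```

Each week after the first four and before the last four enters into two periods, so that successive coefficients are not independent of one another.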

13. It remains to be decided what characteristics of the weather
are to be taken into account. The rainfall is clearly one factor
of great importance, temperature is another, and these two will
afford quite enough labour for a first investigation. The weekly
rainfalls were averaged for eight stations within the area, and
the average taken as the first characteristic of the weather.
Temperatures were taken from the records of the same stations.
The average temperatures, however, do not give quite the sort
of information that is required: at temperatures below a certain
limit (about 42° Fahr.) there is very little growth, and the
growth increases in rapidity as the temperature rises above this
point (within limits). It was therefore decided to utilise the
figures for "accumulated temperatures above 42° Fahr.," i.e.
the total number of day-degrees above 42° during each of the
8-weekly periods, as the second characteristic of the weather;
these "accumulated temperatures," moreover, show much larger
variations than mean temperatures.
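The notion of day-degrees may be made concrete by a small computation. The sketch below rests on an assumption of ours, namely that one mean temperature is available for each day and that only the excess of that mean above 42° is counted; the official method of reckoning accumulated temperature may differ in detail. The temperatures shown are hypothetical.

```python
BASE = 42.0   # degrees Fahrenheit, below which there is very little growth


def accumulated_temperature(daily_means, base=BASE):
    """Total day-degrees above the base for the days supplied."""
    return sum(max(0.0, t - base) for t in daily_means)


# Hypothetical daily mean temperatures for one week (not observed data).
week = [38.0, 41.5, 44.0, 47.0, 50.5, 42.0, 45.0]
print(accumulated_temperature(week))
```

Days at or below the base contribute nothing, which is why the accumulated figure varies more from period to period than the mean temperature does.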

The student should refer to the original for the full discussion
as to data. The method of treating the correlations
between three variables, based on the three possible correlations
between them, is described in Chapter XII.

14. Problems of a somewhat special kind arise when dealing
with the relations between simultaneous values of two variables
which have been observed during a considerable period of time,
for the more rapid movements will often exhibit a fairly close
consilience, while the slower changes show no similarity. The two
following examples will serve as illustrations of two methods which
are generally applicable to such cases.

Illustration iv. Fig. 41 exhibits the movements of (1) the
infantile mortality (deaths of infants under 1 year of age per 1000
births in the same year); (2) the general mortality (deaths at all
ages per 1000 living) in England and Wales during the period
1838-1904. A very cursory inspection of the figure shows that
when the infantile mortality rose from one year to the next
the general mortality also rose, as a rule; and similarly, when the
infantile mortality fell, the general mortality also fell. There
were, in fact, only five or six exceptions to this rule during the
whole period under review. The correlation between the annual
values of the two mortalities would nevertheless not be very high,
as the general mortality has been falling more or less steadily since
1875 or thereabouts, while the infantile mortality attained almost
a record value in 1899. During a long period of time the correlation
between annual values may, indeed, very well vanish, for the
two mortalities are affected by causes which are to a large extent
different in the two cases. To exhibit, therefore, the closeness of the
relation between infantile and general mortality, for such causes
as show marked changes between one year and the next, it will be
best to proceed by correlating the annual changes, and not the annual
values. The work would be arranged in the following form (only
sufficient years being given to exhibit the principle of the process),
and the correlation worked out between the figures of cols. 3 and 5.


   1.          2.               3.               4.               5.
  Year.    Infantile       Increase or        General        Increase or
         Mortality per    Decrease from    Mortality per    Decrease from
          1000 Births.     Year before.     1000 living.     Year before.

  1838        159              ..              22.4              ..
  1839        151              -8              21.8             -0.6
  1840        154              +3              22.9             +1.1
  1841        145              -9              21.6             -1.3
  1842        152              +7              21.7             +0.1
  1843        150              -2              21.2             -0.5


For the period to which the diagram refers, viz. 1838-1904, the
following constants were found by this method:

  Infantile mortality: mean annual change   -0.21
                       standard deviation    9.63
  General mortality:   mean annual change   -0.09
                       standard deviation    1.14
  Coefficient of correlation                +0.77

This is a much higher correlation than would arise from the
mere fact that the deaths of infants form part of the general
mortality, and consequently there must be a high correlation
between the annual changes in the mortality of those who are over
and under 1 year of age. (Cf. Exercises 7 and 8, Chap. XI.)

This method, which appears to have been first used by Miss
Cave and by Mr Hooker independently in the papers cited in
refs. 4 and 6, has recently been generalised by "Student" and
the theory fully developed by O. Anderson (cf. refs. 13, 14, 15).
By taking the first differences the influence of the slower changes
of the two variables with time may not be wholly eliminated,
but this elimination may be more completely effected by proceeding
to the second differences, i.e. by working out the successive
differences of the differences in col. 3 and in col. 5 before
correlating. It may even be desirable to proceed to third, fourth or
higher differences before correlating.

Fig. 41. Infantile and General Mortality in England and Wales, 1838-1904.
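The difference method of Illustration iv. may be sketched as follows. This is a minimal illustration only: the helper names `differences` and `pearson_r` are assumptions of the sketch, not notation from the text, and only the six years 1838-1843 shown in the table are used, so the result is not the coefficient quoted for the whole period.

```python
import math


def differences(series, order=1):
    """Return the successive differences of a series, taken `order` times."""
    result = list(series)
    for _ in range(order):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


def pearson_r(xs, ys):
    """Product-sum correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Cols. 2 and 4 of the table, 1838-1843 only.
infantile = [159, 151, 154, 145, 152, 150]
general = [22.4, 21.8, 22.9, 21.6, 21.7, 21.2]

# Cols. 3 and 5: the annual changes (first differences).
d_infantile = differences(infantile)
d_general = differences(general)

# Correlation between annual changes, not annual values.
r_changes = pearson_r(d_infantile, d_general)

# Second differences, should the slower movement not be eliminated.
d2_infantile = differences(infantile, order=2)
```

Passing a larger `order` gives the third or higher differences in the same way; each order shortens the series by one term.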



Fig. 42. Marriage-rate and Foreign Trade, England and Wales, 1855-1904.


15. Illustration v. The two curves of fig. 42 show (1) the
marriage-rate (persons married per 1000 of the population) for
England and Wales; (2) the values of exports and imports per
head of the population of the United Kingdom for every year
from 1855 to 1904. Inspection of the diagram suggests a similar
relation to that of the last example, the one variable showing a
rise from one year to the next when the other rises, and a fall
when the other falls. The movement of both variables is, however,
of a much more regular kind than that of mortality,
resembling a series of waves superposed on a steady general
trend, and it is the "waves" in the two variables (the short-period
movements, not the slower trends) which are so clearly related.

16. It is not difficult, moreover, to separate the short-period
oscillations, more or less approximately, from the slower movement.
Suppose the marriage-rate for each year replaced by the average
of an odd number of years of which it is the centre, the number
being as near as may be the same as the period of the "waves,"
e.g. nine years. If these short-period averages were plotted on
the diagram instead of the rates of the individual years, we should
evidently obtain a smoother curve which would clearly exhibit
the trend and be practically free from the conspicuous waves.
The excess or defect of each annual rate above or below the
trend, if plotted separately, would therefore give the "waves"
apart from the slower changes. The figures for foreign trade
may be treated in the same way as the marriage-rate, and we
can accordingly work out the correlation between the waves or
rapid fluctuations, undisturbed by the movements of longer period,
however great they may be. The arithmetic may be carried out
in the form of the following table, and the correlation worked out
in the ordinary way between the figures of columns 4 and 7.


  1.        2.           3.         4.          5.            6.         7.
 Year.   Marriage-      Nine      Differ-    Exports +       Nine      Differ-
           rate        Years'      ence.     Imports,       Years'      ence.
         (England     Average.               £ per head    Average.
        and Wales).                           (U.K.).

 1855      16.2          ..         ..          9.36          ..         ..
 1856      16.7          ..         ..         11.14          ..         ..
 1857      16.5          ..         ..         11.86          ..         ..
 1858      16.0          ..         ..         10.73          ..         ..
 1859      17.0         16.5       +0.5        11.72         12.15      -0.43
 1860      17.1         16.6       +0.5        13.03         12.94      +0.09
 1861      16.3         16.7       -0.4        13.01         13.52      -0.51
 1862      16.1         16.8       -0.7        13.40         14.17      -0.77
 1863      16.8         16.9       -0.1        15.13         14.81      +0.32
 1864      17.2          ..         ..         16.43          ..         ..
 1865      17.5          ..         ..         16.37          ..         ..
 1866      17.5          ..         ..         17.72          ..         ..
 1867      16.5          ..         ..         16.47          ..         ..
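The arithmetic of the table may be sketched as below. The function name `centred_moving_average` is an assumption of the sketch, and the figures are those of columns 2 and 5 for 1855-1867 as read from the table; with only thirteen years, the nine-year average exists for 1859-1863 alone.

```python
def centred_moving_average(values, window=9):
    """Average of `window` consecutive values, placed at the central year.

    Years too near either end of the series to have a full window
    are given None.
    """
    if window % 2 == 0:
        raise ValueError("window must be an odd number of years")
    half = window // 2
    averages = [None] * len(values)
    for i in range(half, len(values) - half):
        averages[i] = sum(values[i - half:i + half + 1]) / window
    return averages


def deviations_from_trend(values, trend):
    """Excess or defect of each value above or below the trend."""
    return [None if t is None else v - t for v, t in zip(values, trend)]


years = list(range(1855, 1868))
marriage = [16.2, 16.7, 16.5, 16.0, 17.0, 17.1, 16.3,
            16.1, 16.8, 17.2, 17.5, 17.5, 16.5]
trade = [9.36, 11.14, 11.86, 10.73, 11.72, 13.03, 13.01,
         13.40, 15.13, 16.43, 16.37, 17.72, 16.47]

trend_marriage = centred_moving_average(marriage)   # column 3
trend_trade = centred_moving_average(trade)         # column 6
wave_marriage = deviations_from_trend(marriage, trend_marriage)  # column 4
wave_trade = deviations_from_trend(trade, trend_trade)           # column 7
```

The correlation of Illustration v. is then worked out between the two series of deviations, over the years for which both are available.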


17. Fig. 43 is drawn from the figures of columns 4 and 7, and
shows very well how closely the oscillations of the marriage-rate
are related to those of trade. For the period 1861-95 the
correlation between the two oscillations (Hooker, ref. 5) is 0.86.
The method may obviously be extended by correlating the deviation
of the marriage-rate in any one year with the deviation of
the exports and imports of the year before, or two years before,
instead of the same year; if a sufficient number of years be
taken, an estimate may be made, by interpolation, of the
time-difference that would make the correlation a maximum if it were
possible to obtain the figures for exports and imports for periods
other than calendar years. Thus Mr Hooker finds (ref. 5) that
on an average of the years 1861-95 the correlation would be a
maximum between the marriage-rate and the foreign trade of
about one-third of a year earlier. The method is an extremely
useful one, and is obviously applicable to any similar case. The
student should refer to the paper by Mr Hooker, cited. Reference
may also be made to ref. 10, in which several diagrams are given
similar to fig. 43, and the nature of the relationship between the
marriage-rate and such factors as trade, unemployment, etc., is
discussed, it being suggested that the relation is even more
complex than appears from the above. The same method of
separating the short-period oscillations was used at an earlier
date by Poynting in ref. 16, to which the student is referred
for a discussion of the method.

Fig. 43. Fluctuations in (1) Marriage-rate and (2) Foreign Trade (Exports
+ Imports per head) in England and Wales: the Curves show Deviations
from 9-year means. Data of R. H. Hooker, Jour. Roy. Stat. Soc., 1901.

18. It was briefly mentioned in § 9 of the last chapter that
the treatment of cases when the regression was non-linear was,
in general, somewhat difficult. Such cases lie strictly outside
the scope of the present volume, but it may be pointed out
that if a relation between X and Y be suggested, either by
theory or by previous experience, it may be possible to throw
that relation into the form

$$Y = A + B\,\phi(X),$$

where A and B are the only unknown constants to be determined.
If a correlation-table be then drawn up between Y and $\phi(X)$
instead of Y and X, the regression will be approximately linear.
Thus in Table V. of the last chapter, if X be the rate of
discount and Y the percentage of reserves on deposits, a
diagram of the curves of regression, or curves on which the
means of arrays lie, suggests that the relation between X and Y
is approximately of the form

$$X(Y - B) = A,$$

A and B being constants; that is,

$$XY = A + BX.$$

Or, if we make XY a new variable, say Z,

$$Z = A + BX.$$

Hence, if we draw up a new correlation-table between X and Z
the regression will probably be much more closely linear.
If the relation between the variables be of the form

$$Y = AB^{X},$$

we have

$$\log Y = \log A + X \log B,$$

and hence the relation between log Y and X is linear. Similarly,
if the relation be of the form

$$X^{n}Y = A,$$

we have

$$\log Y = \log A - n \log X,$$

and so the relation between log Y and log X is linear. By
means of such artifices for obtaining correlation-tables in
which the regression is linear, it may be possible to do a good
deal in difficult cases whilst using elementary methods only.
The advanced student should refer to ref. 17 for a different
method of treatment.
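The artifice for the form Y = AB^X may be sketched as follows: log Y is regressed on X by ordinary least squares, and A and B are recovered from the intercept and slope. The names `fit_line` and `fit_exponential`, and the data, are illustrative assumptions only; the points are generated from A = 2, B = 1.5 so that the recovery can be checked.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_exponential(xs, ys):
    """Fit Y = A * B**X by fitting log Y = log A + X log B."""
    log_a, log_b = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(log_a), math.exp(log_b)


# Hypothetical data lying exactly on Y = 2 * 1.5**X.
xs = [0, 1, 2, 3, 4, 5]
ys = [2.0 * 1.5 ** x for x in xs]

a_hat, b_hat = fit_exponential(xs, ys)
```

The form X^n Y = A is treated in the same way, by fitting log Y on log X and reading n from the slope with its sign changed.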

19. The only strict method of calculating the correlation
coefficient is that described in Chapter IX., from the formula

$$r = \frac{\Sigma(xy)}{N\sigma_x\sigma_y}.$$

Approximations to this value may, however, be
found in various ways, for the most part dependent either (1)
on the formulae for the two regressions, $r\,\sigma_x/\sigma_y$ and
$r\,\sigma_y/\sigma_x$, or (2) on the formulae for the standard
deviations of the arrays, $\sigma_x\sqrt{1-r^2}$ and
$\sigma_y\sqrt{1-r^2}$. Such approximate methods are not recommended
for ordinary use, as they will lead to different results in different
hands, but a few may be given here, as being occasionally useful
for estimating the value of the correlation in cases where the
data are not given in such a shape as to permit of the proper
calculation of the coefficient.

(1) The means of rows and columns are plotted on a diagram,
and lines fitted to the points by eye, say by shifting about
a stretched black thread until it seems to run as near as may
be to all the points. If $b_1$, $b_2$ be the slopes of these two lines
to the vertical and the horizontal respectively,

$$r = \sqrt{b_1 b_2}.$$

Hence the value of r may be estimated from any such diagram
as figs. 36-40 in Chapter IX., in the absence of the original
table. Further, if a correlation-table be not grouped by
equal intervals, it may be difficult to calculate the product
sum, but it may still be possible to plot approximately a diagram
of the two lines of regression, and so determine roughly the
value of r. Similarly, if only the means of two rows and
two columns, or of one row and one column in addition to the
means of the two variables, are known, it will still be possible
to estimate the slopes of RR and CC, and hence the correlation
coefficient.

(2) The means of one set of arrays only, say the rows, are
calculated, and also the two standard-deviations $\sigma_x$ and
$\sigma_y$. The means are then plotted on a diagram, using the
standard-deviation of each variable as the unit of measurement,
and a line fitted by eye. The slope of this line to the vertical
is r. If the standard deviations be not used as the units of
measurement in plotting, the slope of the line to the vertical is
$r\,\sigma_x/\sigma_y$, and hence r will be obtained by dividing the
slope by the ratio of the standard-deviations.

This method, or some variation of it, is often useful as a
makeshift when the data are too incomplete to permit of the
proper calculation of the correlation, only one line of regression
and the ratio of the dispersions of the two variables being required:
the ratio of the quartile deviations, or other simple measures of
dispersion, will serve quite well for rough purposes in lieu of the
ratio of standard-deviations. As a special case, we may note that
if the two dispersions are approximately the same, the slope of
RR to the vertical is r.

Plotting the medians of arrays on a diagram with the quartile
deviations as units, and measuring the slope of the line, was the
method of determining the correlation coefficient ("Galton's
function") used by Sir Francis Galton, to whom the introduction
of such a coefficient is due. (Refs. 2-4 of Chap. IX., p. 188.)

(3) If $s_x$ be the standard-deviation of errors of estimate like
$x - b_1 y$, we have from Chap. IX., § 11,

$$s_x^2 = \sigma_x^2(1 - r^2),$$

and hence

$$r = \sqrt{1 - \frac{s_x^2}{\sigma_x^2}}.$$

But if the dispersions of arrays do not differ largely, and the
regression is nearly linear, the value of $s_x$ may be estimated from
the average of the standard-deviations of a few rows, and r
determined, or rather estimated, accordingly. Thus in Table III.,
Chap. IX., the standard-deviations of the ten columns headed
62.5-63.5, 63.5-64.5, etc., are:

    2.56   2.26   2.11   2.26   2.55
    2.45   2.24   2.33   2.23   2.60
                            Mean 2.359

The standard-deviation of the stature of all sons is 2.75: hence
approximately

$$r = \sqrt{1 - \left(\frac{2.359}{2.75}\right)^2} = 0.514.$$

This is the same as the value found by the product-sum method
to the second decimal place. It would be better to take an
average by counting the square of each standard-deviation
once for each observation in the column (or "weighting"
it with the number of observations in the column), but in the
present case this would only lead to a very slightly different
result, viz. s = 2.362, r = 0.512.
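The arithmetic of method (3) may be checked with a short sketch. The function name `r_from_array_sd` is an assumption of the sketch; the ten standard-deviations and the value 2.75 are those quoted above.

```python
import math


def r_from_array_sd(array_sd, total_sd):
    """Estimate r from s = sigma * sqrt(1 - r^2), i.e. r = sqrt(1 - s^2/sigma^2)."""
    return math.sqrt(1.0 - (array_sd / total_sd) ** 2)


# Standard-deviations of the ten columns of Table III., Chap. IX.
array_sds = [2.56, 2.26, 2.11, 2.26, 2.55, 2.45, 2.24, 2.33, 2.23, 2.60]
sd_all_sons = 2.75

# Simple (unweighted) average of the array standard-deviations.
mean_sd = sum(array_sds) / len(array_sds)
r_estimate = r_from_array_sd(mean_sd, sd_all_sons)

# The weighted value quoted in the text, s = 2.362.
r_weighted = r_from_array_sd(2.362, sd_all_sons)
```

The weighted average itself is not recomputed here, since the column frequencies are not given in this section; only the quoted value 2.362 is carried through the formula.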

20. The Correlation Ratio. The method clearly would not
give an approximation to the correlation coefficient, however, in
the case of such tables as V. and VI. of Chap. IX., in which the
means of successive arrays do not lie closely round straight lines.
In such cases it would always tend to give a value for r markedly
higher than that given by the product-sum method. The
product-sum method gives in fact a value based on the
standard-deviation round the line of regression; the method used above
gives a value dependent on the standard-deviation round a line
which sweeps through all the means of arrays, and the second
standard-deviation is necessarily less than the first. We reach,
therefore, a generalised coefficient which measures the approach
towards a curvilinear line of regression of any form.

Let $s_{ax}$ denote the standard-deviation of any array of X's, and
let n, as before, be the number of observations in this array (Chap.
IX., § 11), and further let

$$\sigma_{ax}^2 = \frac{\Sigma(n\,s_{ax}^2)}{N}. \tag{1}$$

Then $\sigma_{ax}$ is an average of the standard-deviations of the arrays
obtained as suggested at the end of the last section. Now let

$$\sigma_{ax}^2 = \sigma_x^2(1 - \eta_{xy}^2),$$

or

$$\eta_{xy}^2 = 1 - \frac{\sigma_{ax}^2}{\sigma_x^2}. \tag{2}$$

Then $\eta_{xy}$ is termed by Professor Pearson a correlation-ratio (ref.
18). As there are clearly two correlation-ratios for any one table,
it should be distinguished as the correlation-ratio of X on Y: it
measures the approach of values of X associated with given
values of Y to a single-valued relationship of any form. The
calculation would be exceedingly laborious if we had actually to
evaluate $\sigma_{ax}$, but this may be avoided and the work greatly
simplified by the following consideration. If $M_x$ denote the mean
of all X's, $m_x$ the mean of an array, then we have by the general
relation given in § 11 of Chap. VIII. (p. 142)

$$\sigma_x^2 = \sigma_{ax}^2 + \frac{\Sigma\{n(M_x - m_x)^2\}}{N}. \tag{3}$$

Or, using $\sigma_{mx}$ to denote the standard-deviation of $m_x$,

$$\sigma_x^2 = \sigma_{ax}^2 + \sigma_{mx}^2. \tag{4}$$

Hence, substituting in (2),

$$\eta_{xy} = \frac{\sigma_{mx}}{\sigma_x}. \tag{5}$$

The correlation-ratio of X on Y is therefore determined when we
have found, in addition to the standard-deviation of X, the
standard-deviation of the means of its arrays.

21. The correlation-ratio of X on Y cannot be less than the
correlation-coefficient for X and Y, and $\eta_{xy}^2 - r^2$ is a measure of
the divergence of the regression of X on Y from linearity. For
if d denote, as in Chap. IX., the deviation of the mean of an
array of X's from the line of regression, we have by the relation
of Chap. IX., § 11, p. 172,

$$\sigma_x^2(1 - r^2) = \sigma_{ax}^2 + \sigma_d^2. \tag{6}$$

Substituting for $\sigma_{ax}$ from (2), that is,

$$\sigma_d^2 = \sigma_x^2(\eta_{xy}^2 - r^2). \tag{7}$$

But $\sigma_d^2$ is necessarily positive, and therefore $\eta_{xy}$ is not less
than r. The magnitude of $\sigma_d$, and therefore of $\eta_{xy}^2 - r^2$,
measures the divergence of the actual line through the means of
arrays from the line of regression.

It should be noted that, owing to the fluctuations of sampling,
r and $\eta$ are almost certain to differ slightly, even though the
regression may be truly linear. The observed value of $\eta^2 - r^2$
must be compared with the values that may arise owing to
fluctuations of sampling alone, before a definite significance can
be ascribed to it (cf. Pearson, ref. 19, Blakeman, ref. 22, and the
formulae cited therefrom on p. 352 below).

22. The following table illustrates the form of the arithmetic
for the calculation of the correlation-ratio of son's stature on
father's stature (Table III. of Chap. IX., p. 160). In the first
column is given the type of the array (stature of father); in the
second, the mean stature of sons for that array; in the third, the
difference of the mean of the array from the mean stature of all
sons. In the fourth column these differences are squared, and in
the sixth they are multiplied by the frequency of the array, two
decimal places only having been retained as sufficient for the
present purpose. The sum total of the last column divided by
the number of observations (1078) gives $\sigma_{my}^2 = 2.058$, or
$\sigma_{my} = 1.43$. As the standard-deviation of the sons' stature is
2.75 in. (cf. Chap. IX., question 3), $\eta_{yx} = 0.52$. Before taking
the differences for the third column of such a table, it is as well
to check the means of the arrays by recalculating from them the
mean of the whole distribution, i.e. multiplying each array-mean by
its frequency, summing, and dividing by the number of observations.
The form of the arithmetic may be varied, if desired, by working
from zero as origin, instead of taking differences from the true
mean. The square of the mean must then be subtracted from
$\Sigma(n\,m_y^2)/N$.

If the second correlation-ratio for this table be worked out in
the same way, the value will be found to be the same to the
second place of decimals: the two correlation-ratios for this table
are, therefore, very nearly identical, and only slightly greater
than the correlation-coefficient (0.51). Both regressions, it
follows from the last section, are very nearly linear, a result
confirmed by the diagram of the regression lines (fig. 37, p. 174).
On the other hand, it is evident from fig. 39, p. 176, that we
should expect the two correlation-ratios for Table VI. of the same
chapter to differ considerably from each other and from the
correlation-coefficient. The values found are $\eta_{xy} = 0.14$,
$\eta_{yx} = 0.38$ (r = -0.014). $\eta_{xy}$ is comparatively low as
proportions of male births differ little in the successive arrays,
but $\eta_{yx}$ is higher since the line of regression of Y on X is
sharply curved. For Table VIII., p. 183, the two ratios are
$\eta_{xy} = 0.46$, $\eta_{yx} = 0.39$ (r = 0.34). The confirmation
of these values is left to the student.

The student should notice that the correlation-ratio only
affords a satisfactory test when the number of observations is
sufficiently large for a grouped correlation table to be formed.
In the case of a short series of observations such as that given in
Table VII., p. 178, the method is inapplicable.

Calculation of the Correlation-Ratio. Example: Son's Stature on
Father's Stature. Data of Table III., Chap. IX., p. 160.

     1.          2.           3.            4.           5.           6.
  Type of     Mean of     Difference    Square of    Frequency.   Frequency x
   Array       Array      from Mean    Difference.               (Difference)^2.
 (Father's    (Son's     of all Sons
 Stature).   Stature).     (68.66).

    59         64.67        -3.99        15.9201         3          47.76
    60         65.64        -3.02         9.1204         3.5        31.92
    61         66.34        -2.32         5.3824         8          43.06
    62         65.56        -3.10         9.6100        17         163.37
    63         66.68        -1.98         3.9204        33.5       131.33
    64         66.74        -1.92         3.6864        61.5       226.71
    65         67.19        -1.47         2.1609        95.5       206.37
    66         67.61        -1.05         1.1025       142         156.56
    67         67.95        -0.71         0.5041       137.5        69.31
    68         69.07        +0.41         0.1681       154          25.89
    69         69.39        +0.73         0.5329       141.5        75.41
    70         69.74        +1.08         1.1664       116         135.30
    71         70.50        +1.84         3.3856        78         264.08
    72         70.87        +2.21         4.8841        49         239.32
    73         72.00        +3.34        11.1556        28.5       317.93
    74         71.50        +2.84         8.0656         4          32.26
    75         71.73        +3.07         9.4249         5.5        51.84

  Total         ..           ..            ..         1078        2218.42

$\sigma_{my}^2 = 2218.42/1078 = 2.058$; $\sigma_{my} = 1.43$;
$\eta_{yx} = 1.43/2.75 = 0.52$.
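The calculation of the table may be repeated as a check with the following sketch. The variable names are assumptions of the sketch; the array means, frequencies, general mean (68.66) and standard-deviation of sons' stature (2.75) are those given above.

```python
import math

# Columns 1, 2 and 5 of the table: father's stature (type of array),
# mean stature of sons in the array, and frequency of the array.
father_stature = list(range(59, 76))
array_means = [64.67, 65.64, 66.34, 65.56, 66.68, 66.74, 67.19, 67.61, 67.95,
               69.07, 69.39, 69.74, 70.50, 70.87, 72.00, 71.50, 71.73]
frequencies = [3, 3.5, 8, 17, 33.5, 61.5, 95.5, 142, 137.5,
               154, 141.5, 116, 78, 49, 28.5, 4, 5.5]

mean_all_sons = 68.66   # general mean, as given in the heading of column 3
sd_all_sons = 2.75      # standard-deviation of sons' stature

n_total = sum(frequencies)

# Check recommended in the text: recover the general mean from the array means.
check_mean = sum(f * m for f, m in zip(frequencies, array_means)) / n_total

# Column 6 summed: frequency x (difference)^2.
sum_weighted_squares = sum(f * (m - mean_all_sons) ** 2
                           for f, m in zip(frequencies, array_means))

variance_of_means = sum_weighted_squares / n_total   # sigma_my squared
sd_of_means = math.sqrt(variance_of_means)           # sigma_my
eta_yx = sd_of_means / sd_all_sons                   # correlation-ratio
```

The total differs very slightly from that of the printed column, since here the products are not rounded to two places before summing.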




REFERENCES.

Illustrative Applications, principally to Economic Statistics,
and Practical Methods.

(1) Yule, G. U., "On the Correlation of total Pauperism with Proportion of
Out-relief," Economic Jour., vol. v., 1895, p. 603, and vol. vi., 1896,
p. 613.

(2) Yule, G. U., "An Investigation into the Causes of Changes in Pauperism
in England chiefly during the last two Intercensal Decades," Jour.
Roy. Stat. Soc., vol. lxii., 1899, p. 249. (Cf. Illustration i.)

(3) Pearson, Karl, Alice Lee, and L. Bramley Moore, "Genetic
(reproductive) Selection: Inheritance of Fertility in Man and of
Fecundity in Thoroughbred Racehorses," Phil. Trans. Roy. Soc., Series
A, vol. cxcii., 1899, p. 257. (Cf. Illustration ii.)

(4) Cave-Browne-Cave, F. E., "On the Influence of the Time-factor on
the Correlation between the Barometric Heights at Stations more than
1000 miles apart," Proc. Roy. Soc., vol. lxxiv., 1904, pp. 403-413.
(The difference-method of Illustration iv. used.)

(5) Hooker, R. H., "On the Correlation of the Marriage-rate with Trade,"
Jour. Roy. Stat. Soc., vol. lxiv., 1901, p. 485. (The method of
Illustration v.)

(6) Hooker, R. H., "On the Correlation of Successive Observations
illustrated by Corn-prices," ibid., vol. lxviii., 1905, p. 696. (The method
of Illustration iv.)

(7) Hooker, R. H., "The Correlation of the Weather and the Crops," ibid.,
vol. lxx., 1907, p. 1. (Cf. Illustration iii.)

(8) Norton, J. P., Statistical Studies in the New York Money Market,
Macmillan Co., New York, 1902. (Applications to financial statistics:
an instantaneous average method, analogous to that of Illustration v., is
employed, but the instantaneous average is obtained by an interpolated
logarithmic curve.)

(9) March, L., "Comparaison numérique de courbes statistiques," Jour.
de la société de statistique de Paris, 1905, pp. 255 and 306. (Uses the
methods of Illustrations iv. and v., but obtaining the instantaneous
average in the latter case by graphical interpolation.)

(10) Yule, G. U., "On the Changes in the Marriage and Birth Rates in
England and Wales during the past Half Century, with an Inquiry as
to their probable Causes," Jour. Roy. Stat. Soc., vol. lxix., 1906, p. 88.

(11) Heron, D., On the Relation of Fertility in Man to Social Status,
"Drapers' Co. Research Memoirs: Studies in National Deterioration,"
I., Dulau & Co., London, 1906.

(12) Jacob, S. M., "On the Correlations of Areas of Matured Crops and the
Rainfall," Mem. Asiatic Soc. Bengal, vol. ii., 1910, p. 847.

(13) "Student," "The Elimination of Spurious Correlation due to Position
in Time or Space," Biometrika, vol. x., 1914, pp. 179-180. (The
extension of the difference-method by the use of successive differences.)

(14) Anderson, O., "Nochmals über 'The Elimination of Spurious Correlation
due to Position in Time or Space,'" Biometrika, vol. x., 1914,
pp. 269-279. (Detailed theory of the same extended method.)

(15) Cave, Beatrice M., and Karl Pearson, "Numerical Illustrations of
the Variate-difference Correlation Method," Biometrika, vol. x., 1914,
pp. 340-356.

(16) Poynting, J. H., "A Comparison of the Fluctuations in the Price of
Wheat, and in the Cotton and Silk Imports into Great Britain," Jour.
Roy. Stat. Soc., vol. xlvii., 1884, p. 34. (This paper was written
before the invention of the correlation coefficient, but is cited because
the method of Illustration v. is used to separate the periodic from the
secular movement: see especially § ix. on the process of averaging
employed.)

Theory of Correlation in the case of Non-linear Regression,
and Curve or Line fitting generally.

(17) Pearson, Karl, "On the Systematic Fitting of Curves to Observations
and Measurements," Biometrika, vol. i. p. 265, and vol. ii. p. 1, 1902.
(The second part is useful for the fitting of curves in cases of non-linear
regression.)

(18) Pearson, Karl, On the General Theory of Skew Correlation and
Non-linear Regression, "Drapers' Co. Research Memoirs: Biometric Series,"
II., Dulau & Co., London, 1905. (The "correlation ratio.")

(19) Pearson, Karl, "On a Correction to be made to the Correlation Ratio,"
Biometrika, vol. viii., 1911, p. 254.

(20) Pearson, Karl, "On Lines and Planes of Closest Fit to Systems of
Points in Space," Phil. Mag., 6th Series, vol. ii., 1901, p. 559.

(21) Pearson, Karl, "On a General Theory of the Method of False
Position," Phil. Mag., June 1903. (A method of curve fitting by
the use of trial solutions.)

(22) Blakeman, J., "On Tests for Linearity of Regression in
Frequency-distributions," Biometrika, vol. iv., 1905, p. 332.

(23) Snow, E. C., "On Restricted Lines and Planes of Closest Fit to Systems
of Points in any number of Dimensions," Phil. Mag., 6th Series, vol.
xxi., 1911, p. 367.

(24) Slutsky, E., "On the Criterion of Goodness of Fit of the Regression
Lines and the best Method of Fitting them to the Data," Jour. Roy.
Stat. Soc., vol. lxxvii., 1913, pp. 78-84.

Abbreviated Methods of Calculation.

See also references to Chapter XVI.

(25) Harris, J. Arthur, "A Short Method of Calculating the Coefficient
of Correlation in the case of Integral Variates," Biometrika, vol. vii.,
1909, p. 214. (Not an approximation, but a true short method.)

(26) Harris, J. Arthur, "On the Calculation of Intra-class and Inter-class
Coefficients of Correlation from Class-moments when the Number of
possible Combinations is large," Biometrika, vol. ix., 1914, pp.
446-472.



CHAPTER XI.

MISCELLANEOUS THEOREMS INVOLVING THE USE OF
THE CORRELATION-COEFFICIENT.


1. Introductory; 2. Standard-deviation of a sum or difference; 3-5.
Influence of errors of observation and of grouping on the
standard-deviation; 6-7. Influence of errors of observation on the
correlation-coefficient (Spearman's theorems); 8. Mean and
standard-deviation of an index; 9. Correlation between indices; 10.
Correlation-coefficient for a two- x two-fold table; 11.
Correlation-coefficient for all possible pairs of N values of a
variable; 12. Correlation due to heterogeneity of material; 13.
Reduction of correlation due to mingling of uncorrelated with
correlated material; 14-17. The weighted mean; 18-19. Application
of weighting to the correction of death rates, etc., for varying sex
and age-distributions; 20. The weighting of forms of average other
than the arithmetic mean.

1. It has already been pointed out that a statistical measure, if
it is to be widely useful, should lend itself readily to algebraical
treatment. The arithmetic mean and the standard-deviation
derive their importance largely from the fact that they fulfil this
requirement better than any other averages or measures of
dispersion; and the following illustrations, while giving a number of
results that are of value in one branch or another of statistical
work, suffice to show that the correlation-coefficient can be treated
with the same facility. This might indeed be expected, seeing
that the coefficient is derived, like the mean and standard-deviation,
by a straightforward process of summation.

2. To find the standard-deviation of the sum or difference Z of
corresponding values of two variables $X_1$ and $X_2$.

Let z, $x_1$, $x_2$ denote deviations of the several variables from
their arithmetic means. Then if

$$Z = X_1 \pm X_2,$$

evidently

$$z = x_1 \pm x_2.$$

Squaring both sides of the equation and summing,

$$\Sigma(z^2) = \Sigma(x_1^2) + \Sigma(x_2^2) \pm 2\Sigma(x_1 x_2).$$

That is, if r be the correlation between $x_1$ and $x_2$, and $\sigma$,
$\sigma_1$, $\sigma_2$ the respective standard-deviations,

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 \pm 2r\sigma_1\sigma_2. \tag{1}$$

If $x_1$ and $x_2$ are uncorrelated, we have the important special case

$$\sigma^2 = \sigma_1^2 + \sigma_2^2. \tag{2}$$

The student should notice that in this case the standard-deviation
of the sum of corresponding values of the two variables
is the same as the standard-deviation of their difference.

The same process will evidently give the standard-deviation of a
linear function of any number of variables. For the sum of a
series of variables $X_1$, $X_2$, ... $X_n$ we must have

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 + \dots + \sigma_n^2
  + 2r_{12}\sigma_1\sigma_2 + 2r_{13}\sigma_1\sigma_3 + \dots
  + 2r_{23}\sigma_2\sigma_3 + \dots,$$

$r_{12}$ being the correlation between $X_1$ and $X_2$, $r_{23}$ the
correlation between $X_2$ and $X_3$, and so on.
3. Influence of Errors of Observation on the Standard-deviation.
The results of § 2 may be applied to the theory of errors of
observation. Let us suppose that, if any value of X be observed
a large number of times, the arithmetic mean of the observations
is approximately the true value, the arithmetic mean error being
zero. Then, the arithmetic mean error being zero for all values
of X, the error, say $\delta$, is uncorrelated with X. In this case if
$x_1$ be an observed deviation from the arithmetic mean, x the true
deviation, we have from the preceding

$$\sigma_{x_1}^2 = \sigma_x^2 + \sigma_\delta^2. \tag{3}$$

The effect of errors of observation is, consequently, to increase the
standard-deviation above its true value. The student should
notice that the assumption made does not imply the complete
independence of X and $\delta$: he is quite at liberty to suppose that
errors fluctuate more, for example, with large than with small
values of X, as might very probably happen. In that case the
contingency-coefficient between X and $\delta$ would not be zero,
although the correlation-coefficient might still vanish as supposed.
4. Influence of Grouping on ihti Standard-deviation — The 
consequence of grouping observations to form the frequency 
distribution is to introduce errors that are, in effect, errors of 



212 


THEORY OR STATISTICS. 


measmement Instead of assigning to any observation its tnie 
value Xj we assign to it tbe value Xj corresponding to the centie 
of the class-interval, thereby making an enor S, where 

Xi = X+8 

To deduce from this equation a formula showing the nature of 
the influence of grouping on the standard-deviation we must know 
the correlation between the error 8 and X or X^ If the original 
distribution were a histogram, X^ and 8 would be uncorrelated, 
the mean value of 8 being zero for every value of X^ further, the 
square of the standaid-deviatioh of 8 would be wdieie c is 

the class-mteival (Chap YIII ^12, eqn (10)) Hence, if cr-^ be the 
standaid-deviation of the grouped values X^ and cr the standard- 
deviation of the true values X, 


But the true frequency distribution is mrely or never a 
histogram, and tiial on any fieqiiency distribution approximating 
to the symmetrical or slightly asymmetrical foims of fig. 5, p 89, 
or fig* 9 (a), p 92, shows that grouping tends to increase rather 
than reduce the standard-deviation If we assume, as m § 3, that 
the coi relation between 8 and X, instead of 8 and X;^, is appreciably 
zero and that the standard-deviation of 8 may be taken as c2/12, 
as before (the values of 8 being to a first appioximation unifoimly 
distributed over the class-interval when all the mteivals are 
oonsideied together), then we have 


This is a formula of correction for grouping (Sheppard's correction,
refs. 1 to 4) that is very frequently used, and that trial
(ref. 1) shows to give very good results for a curve approximating
closely to the form of fig. 5, p. 89. The strict proof of the
formula lies outside the scope of an elementary work: it is based
on two assumptions, (1) that the distribution of frequency is
continuous, (2) that the frequency tapers off gradually to zero
in both directions. The formula would not give accurate results
in the case of such a distribution as that of fig. 9 (b), p. 92, or
fig. 14, p. 97; neither is it applicable at all to the more divergent
forms such as those of figs. 15, et seq.
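
The effect of the correction can be tried numerically. The following is a minimal sketch in Python (not part of the original text). It assumes, purely for illustration, a set of 20,000 evenly spaced quantiles of a normal curve with unit standard-deviation and a class-interval equal to that standard-deviation; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def variance(values):
    """Square of the standard-deviation about the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def sheppard_corrected_variance(grouped_variance, class_interval):
    """Equation (4): subtract c^2/12 from the variance of the grouped values."""
    return grouped_variance - class_interval ** 2 / 12.0


# Illustrative "true" values (an assumption of this sketch): evenly spaced
# quantiles of a normal distribution with unit standard-deviation.
n = 20000
true_values = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

c = 1.0  # class-interval
# Replace every value by the centre of the class-interval in which it falls.
grouped_values = [math.floor(x / c + 0.5) * c for x in true_values]

true_var = variance(true_values)
grouped_var = variance(grouped_values)
corrected_var = sheppard_corrected_variance(grouped_var, c)

print(f"true variance      {true_var:.4f}")
print(f"grouped variance   {grouped_var:.4f}")
print(f"corrected variance {corrected_var:.4f}")
```

Under these assumptions the grouped variance exceeds the true variance by very nearly $c^2/12$, and the corrected value falls close to the true one.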

5. If certain observations be repeated so that we have in every
case two measures $x_1$ and $x_2$ of the same deviation $x$, it is possible
to obtain the true standard-deviation $\sigma_x$ if the further assumption
is legitimate that the errors $\delta_1$ and $\delta_2$ are uncorrelated with each
other. On this assumption

$$\Sigma(x_1 x_2) = \Sigma(x + \delta_1)(x + \delta_2) = \Sigma(x^2),$$

and accordingly

$$\sigma_x^2 = \frac{\Sigma(x_1 x_2)}{N}. \qquad (5)$$

(This formula is part of Spearman's formula for the correction of
the correlation-coefficient, cf. § 7.)

6. Influence of Errors of Observation on the Correlation-coefficient.
Let $x_1$, $y_1$ be the observed deviations from the arithmetic means,
$x$, $y$ the true deviations, and $\delta$, $\epsilon$ the errors of observation. Of
the four quantities $x$, $y$, $\delta$, $\epsilon$ we will suppose $x$ and $y$ alone to
be correlated. On this assumption

$$\Sigma(x_1 y_1) = \Sigma(xy). \qquad (6)$$

It follows at once that

$$\frac{r_{x_1 y_1}}{r_{xy}} = \frac{\sigma_x \sigma_y}{\sigma_{x_1} \sigma_{y_1}},$$

and consequently the observed correlation is less than the true
correlation. This difference, it should be noticed, no mere increase
in the number of observations can in any way lessen.

7. Spearman's Theorems. If, however, the observations of both
$x$ and $y$ be repeated, as assumed in § 5, so that we have two
measures $x_1$ and $x_2$, $y_1$ and $y_2$ of every value of $x$ and $y$, the true
value of the correlation can be obtained by the use of equations
(5) and (6), on assumptions similar to those made above. For
we have

$$r_{xy}^2 = \frac{\Sigma(x_1 y_1)\,\Sigma(x_2 y_2)}{\Sigma(x_1 x_2)\,\Sigma(y_1 y_2)} = \frac{r_{x_1 y_1}\, r_{x_2 y_2}}{r_{x_1 x_2}\, r_{y_1 y_2}} = \frac{r_{x_1 y_2}\, r_{x_2 y_1}}{r_{x_1 x_2}\, r_{y_1 y_2}}. \qquad (7)$$

Or, if we use all the four possible correlations between observed
values of $x$ and observed values of $y$,

$$r_{xy}^4 = \frac{r_{x_1 y_1}\, r_{x_2 y_2}\, r_{x_1 y_2}\, r_{x_2 y_1}}{(r_{x_1 x_2}\, r_{y_1 y_2})^2}. \qquad (8)$$

Equation (8) is the original form in which Spearman gave his
correction formula (refs. 6, 7). It will be seen to imply the
assumption that, of the six quantities $x$, $y$, $\delta_1$, $\delta_2$, $\epsilon_1$, $\epsilon_2$, $x$ and $y$
alone are correlated. The correction given by the second part
of equation (7), also suggested by Spearman, seems, on the
whole, to be safer, for it eliminates the assumption that the errors
in $x$ and in $y$, in the same series of observations, are uncorrelated.
An insufficient though partial test of the correctness of the
assumptions may be made by correlating $x_1 - x_2$ with $y_1 - y_2$: this
correlation should vanish. Evidently, however, it may vanish
from symmetry without thereby implying that all the correlations
of the errors are zero.
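
Equations (5) to (8) can be checked on a small constructed example. The sketch below (Python, not part of the original text) assumes an artificial set of sixteen observations built from a two-level factorial design, so that the true deviations and the four sets of errors are exactly uncorrelated in the required sense; the true correlation is fixed at 0.6 by construction, and all names are hypothetical.

```python
import math
from itertools import product


def corr(u, v):
    """Product-moment correlation of two equally long lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


# Sixteen runs of a 2^4 factorial design: the columns a, b, c, d and their
# products are mutually orthogonal, which gives exactly uncorrelated errors.
runs = list(product([-1, 1], repeat=4))
x = [a for a, b, c, d in runs]                     # true deviations of x
y = [0.6 * a + 0.8 * b for a, b, c, d in runs]     # true deviations of y
delta1 = [0.5 * c for a, b, c, d in runs]          # errors in first measure of x
delta2 = [0.5 * d for a, b, c, d in runs]          # errors in second measure of x
eps1 = [0.5 * c * d for a, b, c, d in runs]        # errors in first measure of y
eps2 = [0.5 * a * b * c for a, b, c, d in runs]    # errors in second measure of y

x1 = [p + q for p, q in zip(x, delta1)]
x2 = [p + q for p, q in zip(x, delta2)]
y1 = [p + q for p, q in zip(y, eps1)]
y2 = [p + q for p, q in zip(y, eps2)]

n = len(runs)
true_var_x = sum(p * q for p, q in zip(x1, x2)) / n          # equation (5)
observed_r = corr(x1, y1)                                     # attenuated value
r_first = math.sqrt(corr(x1, y1) * corr(x2, y2)
                    / (corr(x1, x2) * corr(y1, y2)))          # (7), first part
r_second = math.sqrt(corr(x1, y2) * corr(x2, y1)
                     / (corr(x1, x2) * corr(y1, y2)))         # (7), second part
r_fourth = (corr(x1, y1) * corr(x2, y2) * corr(x1, y2) * corr(x2, y1)
            / (corr(x1, x2) * corr(y1, y2)) ** 2) ** 0.25     # equation (8)

print(true_var_x, observed_r, r_first, r_second, r_fourth)
```

In this constructed case the observed correlation is 0.48, while each of the corrected forms recovers the true value 0.6.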

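8. Mean and Standard-deviation of an Index. (Ref. 11.) The
means and standard-deviations of non-linear functions of two or
more variables can in general only be expressed in terms of the
means and standard-deviations of the original variables to a first
approximation, on the assumption that deviations are small
compared with the mean values of the variables. Thus let it be
required to find the mean and standard-deviation of a ratio or
index $Z = X_1/X_2$ in terms of the constants for $X_1$ and $X_2$. Let $I$
be the mean of $Z$, $M_1$ and $M_2$ the means of $X_1$ and $X_2$. Then

$$I = \frac{1}{N}\Sigma\left(\frac{X_1}{X_2}\right) = \frac{1}{N}\frac{M_1}{M_2}\,\Sigma\left(1 + \frac{x_1}{M_1}\right)\left(1 + \frac{x_2}{M_2}\right)^{-1}.$$

Expand the second bracket by the binomial theorem, assuming
that $x_2/M_2$ is so small that powers higher than the second can
be neglected. Then to this approximation

$$I = \frac{1}{N}\frac{M_1}{M_2}\left\{N - \frac{\Sigma(x_1 x_2)}{M_1 M_2} + \frac{\Sigma(x_2^2)}{M_2^2}\right\}.$$

That is, if $r$ be the correlation between $x_1$ and $x_2$, and if $v_1 = \sigma_1/M_1$,
$v_2 = \sigma_2/M_2$,

$$I = \frac{M_1}{M_2}\left(1 - r v_1 v_2 + v_2^2\right). \qquad (9)$$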

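If $s$ be the standard-deviation of $Z$ we have

$$s^2 + I^2 = \frac{1}{N}\Sigma\left(\frac{X_1}{X_2}\right)^2 = \frac{1}{N}\frac{M_1^2}{M_2^2}\,\Sigma\left(1 + \frac{x_1}{M_1}\right)^2\left(1 + \frac{x_2}{M_2}\right)^{-2}.$$

Expanding the second bracket again by the binomial theorem,
and neglecting terms of all orders above the second,

$$s^2 + I^2 = \frac{1}{N}\frac{M_1^2}{M_2^2}\,\Sigma\left(1 + 2\frac{x_1}{M_1} + \frac{x_1^2}{M_1^2}\right)\left(1 - 2\frac{x_2}{M_2} + 3\frac{x_2^2}{M_2^2}\right) = \frac{M_1^2}{M_2^2}\left(1 + v_1^2 - 4 r v_1 v_2 + 3 v_2^2\right),$$

or from (9)

$$s^2 = \frac{M_1^2}{M_2^2}\left(v_1^2 - 2 r v_1 v_2 + v_2^2\right). \qquad (10)$$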
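
As a small numerical aid, the approximations (9) and (10) may be written as a function. This is a sketch in Python (not part of the original text); the function name and the figures used in the example are assumptions chosen only for illustration, and the results are valid only when the coefficients of variation are small.

```python
import math


def index_mean_and_sd(m1, m2, v1, v2, r):
    """Approximate mean and standard-deviation of the index X1/X2.

    m1, m2 : means of X1 and X2
    v1, v2 : coefficients of variation sigma/M, expressed as fractions
    r      : correlation between X1 and X2
    """
    mean = (m1 / m2) * (1 - r * v1 * v2 + v2 ** 2)                     # eqn. (9)
    sd = (m1 / m2) * math.sqrt(v1 ** 2 - 2 * r * v1 * v2 + v2 ** 2)    # eqn. (10)
    return mean, sd


# Hypothetical constants: means 10 and 5, both coefficients of variation
# 10 per cent, and the two variables uncorrelated.
mean_index, sd_index = index_mean_and_sd(10.0, 5.0, 0.1, 0.1, 0.0)
print(mean_index, sd_index)
```

With these figures the mean of the index is slightly above the ratio of the means (2.02 against 2), as equation (9) requires when $r = 0$.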


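9. Correlation between Indices. (Ref. 11.) The following problem
affords a further illustration of the use of the same method.
Required to find approximately the correlation between two ratios
$Z_1 = X_1/X_3$, $Z_2 = X_2/X_3$, $X_1$, $X_2$ and $X_3$ being uncorrelated.

Let the means of the two ratios or indices be $I_1$, $I_2$ and the
standard-deviations $s_1$, $s_2$: these are given approximately by (9)
and (10) of the last section. The required correlation $\rho$ will be
given by

$$N \rho\, s_1 s_2 = \Sigma\left(\frac{X_1}{X_3} - I_1\right)\left(\frac{X_2}{X_3} - I_2\right) = \Sigma\left(\frac{X_1 X_2}{X_3^2}\right) - N I_1 I_2$$

$$= \frac{M_1 M_2}{M_3^2}\,\Sigma\left(1 + \frac{x_1}{M_1}\right)\left(1 + \frac{x_2}{M_2}\right)\left(1 + \frac{x_3}{M_3}\right)^{-2} - N I_1 I_2.$$

Neglecting terms of higher order than the second as before and
remembering that all correlations are zero, we have

$$\rho\, s_1 s_2 = \frac{M_1 M_2}{M_3^2}\left(1 + 3 v_3^2\right) - I_1 I_2 = \frac{M_1 M_2}{M_3^2}\, v_3^2,$$

where, in the last step, a term of the order $v_3^4$ has again been
neglected. Substituting from (10) for $s_1$ and $s_2$, we have finally

$$\rho = \frac{v_3^2}{\sqrt{(v_1^2 + v_3^2)(v_2^2 + v_3^2)}}. \qquad (11)$$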


This value of $\rho$ is obviously positive, being equal to 0.5 if
$v_1 = v_2 = v_3$; and hence even if $X_1$ and $X_2$ are independent, the
indices formed by taking their ratios to a common denominator $X_3$ will
be correlated. The value of $\rho$ is termed by Professor Pearson the
"spurious correlation." Thus if measurements be taken, say, on
three bones of the human skeleton, and the measurements grouped
in threes absolutely at random, there will, nevertheless, be a
positive correlation, probably approaching 0.5, between the indices
formed by the ratios of two of the measurements to the third. To
give another illustration, if two individuals both observe the same
series of magnitudes quite independently, there may be little, if
any, correlation between their absolute errors. But if the errors
be expressed as percentages of the magnitude observed, there
may be considerable correlation. It does not follow of necessity
that the correlations between indices or ratios are misleading.
If the indices are uncorrelated, there will be a similar "spurious"
correlation between the absolute measurements $Z_1 X_3 = X_1$ and
$Z_2 X_3 = X_2$. The answer to the question whether the correlation
between indices or that between absolute measures is misleading
depends on the further question whether the indices or the
absolute measures are the quantities directly determined by the
causes under investigation (cf. ref. 13).

The case considered, where $X_1$, $X_2$, $X_3$ are uncorrelated, is only
a special one; for the general discussion cf. ref. 11. For an
interesting study of actual illustrations cf. ref. 14.
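
Equation (11) is easily evaluated for any assumed coefficients of variation. The following sketch (Python, not part of the original text) uses a hypothetical function name and purely illustrative values.

```python
import math


def spurious_correlation(v1, v2, v3):
    """Equation (11): approximate correlation between X1/X3 and X2/X3
    when X1, X2 and X3 are uncorrelated; v1, v2, v3 are their
    coefficients of variation (as fractions)."""
    return v3 ** 2 / math.sqrt((v1 ** 2 + v3 ** 2) * (v2 ** 2 + v3 ** 2))


# Equal coefficients of variation give the value 0.5 noted in the text.
rho_equal = spurious_correlation(0.05, 0.05, 0.05)

# A denominator that varies very little produces hardly any spurious correlation.
rho_steady_denominator = spurious_correlation(0.05, 0.05, 0.01)

print(rho_equal, rho_steady_denominator)
```

The second call illustrates that the size of the spurious correlation depends on how variable the common denominator is relative to the numerators.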

10. Correlation-coefficient for a two- × two-fold Table. The
correlation-coefficient is in general only calculated for a table with
a considerable number of rows and columns, such as those given
in Chapter IX. In some cases, however, a theoretical value is
obtainable for the coefficient, which holds good even for the limiting
case when there are only two values possible for each variable (e.g.
0 and 1) and consequently two rows and two columns (cf. one
illustration in § 11, and for others the references given in questions 11
and 12). It is therefore of some interest to obtain an expression
for the coefficient in this case in terms of the class-frequencies.
Using the notation of Chapters I.-IV. the table may be written
in the form

    Values of            Values of First Variable
    Second Variable      $X_1$          $X_2$          Total
    $Y_1$                $(AB)$         $(\alpha B)$       $(B)$
    $Y_2$                $(A\beta)$       $(\alpha\beta)$      $(\beta)$
    Total                $(A)$          $(\alpha)$         $N$

Taking the centre of the table as arbitrary origin and the
class-interval, as usual, as the unit, the co-ordinates of the
mean are

$$\xi = \frac{1}{2N}\{(\alpha) - (A)\}, \qquad \eta = \frac{1}{2N}\{(\beta) - (B)\}.$$


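The standard-deviations $\sigma_1$, $\sigma_2$ are given by

$$\sigma_1^2 = 0.25 - \xi^2 = (A)(\alpha)/N^2,$$

$$\sigma_2^2 = 0.25 - \eta^2 = (B)(\beta)/N^2.$$

Finally,

$$\Sigma(xy) = \tfrac{1}{4}\{(AB) + (\alpha\beta) - (A\beta) - (\alpha B)\} - N\xi\eta.$$

Writing

$$(AB) - (A)(B)/N = \delta$$

(as in Chap. III. §§ 11-12) and replacing $\xi$, $\eta$ by their values,
this reduces to

$$\Sigma(xy) = \delta.$$

Whence

$$r = \frac{N\delta}{\sqrt{(A)(\alpha)(B)(\beta)}}. \qquad (12)$$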


This value of $r$ can be used as a coefficient of association, but,
unlike the association-coefficient of Chap. III. § 13, which is
unity if either $(AB) = (A)$ or $(AB) = (B)$, $r$ only becomes unity if
$(AB) = (A) = (B)$. This is the only case in which both frequencies
$(\alpha B)$ and $(A\beta)$ can vanish so that $(AB)$ and $(\alpha\beta)$ correspond to
the frequencies of two points $(X_1, Y_1)$, $(X_2, Y_2)$ on a line. Obviously
this alone renders the numerical values of the two coefficients
quite incomparable with each other. But further, while the
association coefficient is the same for all tables derived from one
another by multiplying rows or columns by arbitrary coefficients,
the correlation coefficient (12) is greatest when $(A) = (\alpha)$ and
$(B) = (\beta)$, i.e. when the table is symmetrical, and its value is
lowered when the symmetrical table is rendered asymmetrical by
increasing or reducing the number of $A$'s or $B$'s. For moderate
degrees of association, the association coefficient gives much the
larger values. The two coefficients possess, in fact, essentially
different properties, and are different measures of association in
the same sense that the geometric and arithmetic means are
different forms of average, or the interquartile range and the
standard-deviation different measures of dispersion.

The student is again referred to ref. 3 of Chap. III. for a
general discussion of various measures of association, including
these and others, that have been proposed.
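
The contrast between the two coefficients can be seen on a single table. The sketch below (Python, not part of the original text) computes the correlation-coefficient of equation (12) and the association coefficient in its usual form, $Q = \{(AB)(\alpha\beta) - (A\beta)(\alpha B)\}/\{(AB)(\alpha\beta) + (A\beta)(\alpha B)\}$; the function names and the class-frequencies are hypothetical.

```python
import math


def correlation_2x2(ab, a_beta, alpha_b, alpha_beta):
    """Equation (12): r = N * delta / sqrt((A)(alpha)(B)(beta))."""
    a = ab + a_beta            # (A)
    alpha = alpha_b + alpha_beta
    b = ab + alpha_b           # (B)
    beta = a_beta + alpha_beta
    n = a + alpha
    delta = ab - a * b / n     # excess of (AB) over its independence value
    return n * delta / math.sqrt(a * alpha * b * beta)


def association_q(ab, a_beta, alpha_b, alpha_beta):
    """Association coefficient Q from the four class-frequencies."""
    cross1 = ab * alpha_beta
    cross2 = a_beta * alpha_b
    return (cross1 - cross2) / (cross1 + cross2)


# Hypothetical symmetrical table: (AB)=30, (A beta)=10, (alpha B)=10, (alpha beta)=30.
r = correlation_2x2(30, 10, 10, 30)
q = association_q(30, 10, 10, 30)
print(r, q)

# Multiplying one row by ten leaves Q unchanged but alters r.
r_scaled = correlation_2x2(300, 10, 100, 30)
q_scaled = association_q(300, 10, 100, 30)
print(r_scaled, q_scaled)
```

For the symmetrical table $r = 0.5$ while $Q = 0.8$, in line with the remark that for moderate degrees of association the association coefficient gives much the larger values.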

11. The Correlation-coefficient for all possible pairs of N values
of a Variable. In certain cases a correlation-table is formed by
combining $N$ observations in pairs in all possible ways. If, for
example, a table is being formed to illustrate, say, the correlation
between brothers for stature, and there are three brothers in
one family with statures 5 ft. 9, 5 ft. 10, and 5 ft. 11, these are
regarded as giving the six pairs

    5 ft. 9  with 5 ft. 10        5 ft. 10 with 5 ft. 9
    5 ft. 9  with 5 ft. 11        5 ft. 11 with 5 ft. 9
    5 ft. 10 with 5 ft. 11        5 ft. 11 with 5 ft. 10

which may be entered into the table. The entire table will be
formed from the aggregate of such subsidiary tables, each due to
one family. Let it be required to find the correlation-coefficient,
however, for a single subsidiary table, due to a family with $N$
members, the numbers of pairs being therefore $N(N - 1)$.
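
The coefficient for such a subsidiary table can also be found by direct enumeration. The following sketch (Python, not part of the original text) forms every ordered pair from a family's values and computes the correlation between the first and second member of the pair; the statures are taken as 69, 70 and 71 inches, and the function name is hypothetical.

```python
import math
from itertools import permutations


def all_pairs_correlation(values):
    """Correlation between first and second member over all ordered pairs."""
    pairs = list(permutations(values, 2))   # N(N - 1) ordered pairs
    n = len(pairs)
    mean_first = sum(a for a, _ in pairs) / n
    mean_second = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_first) * (b - mean_second) for a, b in pairs)
    sxx = sum((a - mean_first) ** 2 for a, _ in pairs)
    syy = sum((b - mean_second) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


r_three = all_pairs_correlation([69, 70, 71])    # three brothers
r_two = all_pairs_correlation([69, 71])          # two brothers
r_four = all_pairs_correlation([66, 69, 70, 73]) # four brothers (hypothetical)
print(r_two, r_three, r_four)
```

The values obtained depend only on the number of members in the family, not on the particular statures chosen.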

As each observed value of the variable occurs $N - 1$ times,
i.e. once in combination with every other value, the means and
standard-deviations of the totals of the correlation-table are the
same as for the original $N$ observations, say $M$ and $\sigma$. If $x_1$,
$x_2$, $x_3$ ... be the observed deviations, the product sum may be
written

$$\begin{aligned}
&x_1 x_2 + x_1 x_3 + x_1 x_4 + \ldots \\
&\quad + x_2 x_1 + x_2 x_3 + x_2 x_4 + \ldots \\
&\quad + x_3 x_1 + x_3 x_2 + x_3 x_4 + \ldots \\
&\quad + \ldots \\
&= x_1\{\Sigma(x) - x_1\} + x_2\{\Sigma(x) - x_2\} + x_3\{\Sigma(x) - x_3\} + \ldots \\
&= -x_1^2 - x_2^2 - x_3^2 - \ldots = -N\sigma^2,
\end{aligned}$$

whence, there being $N(N - 1)$ pairs,

$$r = -\frac{N\sigma^2}{N(N-1)\sigma^2} = -\frac{1}{N-1}. \qquad (13)$$

For $N = 2, 3, 4 \ldots$ this gives the successive values of $r = -1$,
$-\tfrac{1}{2}$, $-\tfrac{1}{3}$ .... It is clear that the first value is right, for two
values $x_1$, $x_2$ only determine the two points $(x_1, x_2)$ and $(x_2, x_1)$,
and the slope of the line joining them is negative.

The student should notice that a corresponding negative
association will arise between the first and second member of the
pair if all possible pairs are formed in a mixture of $A$'s and $\alpha$'s.
Looking at the association, in fact, from the standpoint of § 10,
the equation (13) still holds, even if the variables can only assume
two values, e.g. 0 and 1. This result is utilised in § 14 of Chapter
XIV.

12. Correlation due to Heterogeneity of Material. The following
theorem offers some analogy with the theorem of Chap. IV.
§ 6 for attributes: If X and Y are uncorrelated in each of two
records, they will nevertheless exhibit some correlation when the
two records are mingled, unless the mean value of X in the
second record is identical with that in the first record, or the mean
value of Y in the second record is identical with that in the first
record, or both.

This follows almost at once, for if $M_1$, $M_2$ are the mean values of
$X$ in the two records, $K_1$, $K_2$ the mean values of $Y$, $N_1$, $N_2$ the
numbers of observations, and $M$, $K$ the means when the two
records are mingled, the product-sum of deviations about $M$, $K$ is

$$N_1(M_1 - M)(K_1 - K) + N_2(M_2 - M)(K_2 - K).$$

Evidently the first term can only be zero if $M = M_1$ or $K = K_1$.
But the first condition gives

$$\frac{N_1 M_1 + N_2 M_2}{N_1 + N_2} = M_1,$$

that is, $M_2 = M_1$.

Similarly, the second condition gives $K_2 = K_1$. Both the first
and second terms can, therefore, only vanish if $M_1 = M_2$ or
$K_1 = K_2$. Correlation may accordingly be created by the mingling
of two records in which $X$ and $Y$ vary round different means.
(For a more general form of the theorem cf. ref. 20.)
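
The theorem may be illustrated with two very small artificial records. The sketch below (Python, not part of the original text) assumes two records of four pairs each, within each of which $X$ and $Y$ are exactly uncorrelated; all names and figures are hypothetical.

```python
import math


def corr(pairs):
    """Product-moment correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# First record: means (0, 0), X and Y uncorrelated.
record_1 = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
# Second record: same scatter, but both means shifted to 4.
record_2 = [(x + 4, y + 4) for x, y in record_1]
# Third record: only the mean of X shifted, the mean of Y unchanged.
record_3 = [(x + 4, y) for x, y in record_1]

r_within_1 = corr(record_1)
r_within_2 = corr(record_2)
r_mingled_both_means_differ = corr(record_1 + record_2)
r_mingled_one_mean_differs = corr(record_1 + record_3)
print(r_within_1, r_within_2,
      r_mingled_both_means_differ, r_mingled_one_mean_differs)
```

Mingling the first two records creates a correlation of 0.8 although neither shows any; mingling the first and third creates none, since the means of $Y$ agree.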

13. Reduction of Correlation due to mingling of uncorrelated
with correlated pairs. Suppose that $n_1$ observations of $x$ and $y$
give a correlation-coefficient

$$r_1 = \frac{\Sigma(xy)}{n_1 \sigma_x \sigma_y}.$$

Now let $n_2$ pairs be added to the material, the means and
standard-deviations of $x$ and $y$ being the same as in the first
series of observations, but the correlation zero. The value of
$\Sigma(xy)$ will then be unaltered, and we will have

$$r_2 = \frac{\Sigma(xy)}{(n_1 + n_2)\sigma_x \sigma_y}.$$

Whence

$$\frac{r_2}{r_1} = \frac{n_1}{n_1 + n_2}. \qquad (14)$$

Suppose, for example, that a number of bones of the human
skeleton have been disinterred during some excavations, and
a correlation $r_2$ is observed between pairs of bones presumed
to come from the same skeleton, this correlation being rather
lower than might have been expected, and subject to some
uncertainty owing to doubts as to the allocation of certain
bones. If $r_1$ is the value that would be expected from other
records, the difference might be accounted for on the hypothesis
that, in a proportion $(r_1 - r_2)/r_1$ of all the pairs, the bones do
not really belong to the same skeleton, and have been virtually
paired at random. (For a more general form of the theorem cf.
again ref. 20.)
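
Equation (14) gives at once the proportion of randomly paired observations implied by a given reduction of the correlation, namely $n_2/(n_1 + n_2) = 1 - r_2/r_1$. A minimal sketch in Python (not part of the original text; the function names and the two correlations are hypothetical):

```python
def diluted_correlation(r1, n1, n2):
    """Equation (14): correlation after n2 uncorrelated pairs are mingled
    with n1 pairs whose correlation is r1."""
    return r1 * n1 / (n1 + n2)


def proportion_paired_at_random(r_expected, r_observed):
    """Proportion n2 / (n1 + n2) of pairs that must be supposed paired
    at random to reduce r_expected to r_observed."""
    return 1.0 - r_observed / r_expected


# Hypothetical figures: an expected correlation of 0.60 observed as 0.45.
proportion = proportion_paired_at_random(0.60, 0.45)
print(proportion)

# Check against (14): 75 correlated pairs mingled with 25 random ones.
print(diluted_correlation(0.60, 75, 25))
```

The result rests entirely on the assumptions of the section, in particular that the added pairs have the same means and standard-deviations as the rest.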

14. The Weighted Mean. The arithmetic mean $M$ of a series
of values of a variable $X$ was defined as the quotient of the sum
of those values by their number $N$, or

$$M = \Sigma(X)/N.$$

If, on the other hand, we multiply each several observed
value of $X$ by some numerical coefficient or weight $W$, the
quotient of the sum of such products by the sum of the weights
is defined as a weighted mean of $X$, and may be denoted by $M'$,
so that

$$M' = \Sigma(WX)/\Sigma(W).$$

The distinction between "weighted" and "unweighted" means
is, it should be noted, very often formal rather than essential,
for the "weights" may be regarded as actual, estimated, or
virtual frequencies. The weighted mean then becomes simply
an arithmetic mean, in which some new quantity is regarded
as the unit. Thus if we are given the means $M_1$, $M_2$, $M_3$ ...
$M_r$ of $r$ series of observations, but do not know the number
of observations in every series, we may form a general average
by taking the arithmetic mean of all the means, viz. $\Sigma(M)/r$,
treating the series as the unit. But if we know the number
of observations in every series it will be better to form the
weighted mean $\Sigma(NM)/\Sigma(N)$, weighting each mean in proportion
to the number of observations in the series on which it is based.
The second form of average would be quite correctly spoken
of as a weighted mean of the means of the several series: at
the same time it is simply the arithmetic mean of all the
series pooled together, i.e. the arithmetic mean obtained by
treating the observation and not the series as the unit.
(Chap. VII. § 13.)

15. To give an arithmetical illustration, if a commodity is sold
at different prices in different markets, it will be better to form
an average price, not by taking the arithmetic mean of the several
market prices, treating the market as the unit, but by weighting
each price in proportion to the quantity sold at that price, if
known, i.e. treating the unit of quantity as the unit of frequency.
Thus if wheat has been sold in market A at an average price of
29s. 1d. per quarter, in market B at an average price of 27s. 7d.,
and in market C at an average price of 28s. 4d., we may, if no
statement is made as to the quantities sold at these prices (as very
often happens in the case of statements as to market prices), take
the arithmetic mean (28s. 4d.) as the general average. But if we
know that 23,930 qrs. were sold at A, only 26 qrs. at B, and 3933
qrs. at C, it will be better to take the weighted mean

$$\frac{(29\text{s. }1\text{d.} \times 23{,}930) + (27\text{s. }7\text{d.} \times 26) + (28\text{s. }4\text{d.} \times 3933)}{27{,}889} = 29\text{s. }0\text{d.}$$

to the nearest penny. This is appreciably higher than the
arithmetic mean price, which is lowered by the undue importance
attached to the small markets B and C.
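
The arithmetic of this illustration is easily reproduced. The sketch below (Python, not part of the original text) converts the prices to pence, at twelve pence to the shilling, and forms both averages; the variable names are hypothetical.

```python
def to_pence(shillings, pence):
    """Convert a price in shillings and pence to pence (12d. = 1s.)."""
    return 12 * shillings + pence


# Average prices per quarter and quantities sold, as given in the text.
prices = {"A": to_pence(29, 1), "B": to_pence(27, 7), "C": to_pence(28, 4)}
quarters = {"A": 23930, "B": 26, "C": 3933}

# Unweighted mean: every market counts once.
unweighted = sum(prices.values()) / len(prices)

# Weighted mean: every quarter of wheat counts once.
weighted = (sum(prices[m] * quarters[m] for m in prices)
            / sum(quarters.values()))

print(divmod(round(unweighted), 12))  # shillings and pence, unweighted
print(divmod(round(weighted), 12))    # shillings and pence, weighted
```

The unweighted mean is 340d. (28s. 4d.); the weighted mean is about 347.7d., or 29s. 0d. to the nearest penny.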

In the case of index-numbers for exhibiting the changes in
average prices from year to year (cf. Chap. VII. § 25), it may
make a sensible difference whether we take the simple arithmetic
mean of the index-numbers for different commodities in any one
year as representing the price-level in that year, or weight the
index-numbers for the several commodities according to their
importance from some point of view; and much has been written
as to the weights to be chosen. If, for example, our standpoint
be that of some average consumer, we may take as the weight for
each commodity the sum which he spends on that commodity in
an average year, so that the frequency of each commodity is
taken as the number of shillings or pounds spent thereon instead
of simply as unity.

Rates or ratios like the birth-, death-, or marriage-rates of a
country may be regarded as weighted means. For, treating the
rate for simplicity as a fraction, and not as a rate per 1000 of the
population,

$$\text{Birth-rate of whole country} = \frac{\text{total births}}{\text{total population}} = \frac{\Sigma(\text{birth-rate in each district} \times \text{population in that district})}{\Sigma(\text{population of each district})},$$

i.e. the rate for the whole country is the mean of the rates in the
different districts, weighting each in proportion to its population.
We use the weighted and unweighted means of such rates as
illustrations in § 17 below.

16. It is evident that any weighted mean will in general differ
from the unweighted mean of the same quantities, and it is
required to find an expression for this difference. If $r$ be the
correlation between weights and variables, $\sigma_w$ and $\sigma_x$ the
standard-deviations, and $\bar{w}$ the mean weight, we have at once

$$\Sigma(W X) = N(M \bar{w} + r \sigma_w \sigma_x),$$

whence

$$M' = M + r \sigma_x \frac{\sigma_w}{\bar{w}}. \qquad (15)$$

That is to say, if the weights and variables are positively correlated,
the weighted mean is the greater; if negatively, the less. In some
cases $r$ is very small, and then weighting makes little difference,
but in others the difference is large and important, $r$ having a
sensible value and $\sigma_x \sigma_w/\bar{w}$ a large value.

17. The difference between weighted and unweighted means
of death-rates, birth-rates or other rates on the population in
different districts is, for instance, nearly always of importance.
Thus we have the following figures for rates of pauperism
(Jour. Stat. Soc., vol. lix. (1896), p. 349).

                  Percentages of the Population in receipt of Relief.
    January 1.    Arithmetic Mean of Rates      England and Wales
                  in different Districts.       as a whole.
    1850                  6.51                        6.80
    1860                  5.20                        4.26
    1870                  5.45                        4.77
    1881                  3.68                        3.12
    1891                  3.29                        2.69


In this case the weighted mean is markedly the less, and the
correlation between the population of a district and its pauperism
must therefore be negative, the larger (on the whole urban)
districts having the lower percentage in receipt of relief. On the
other hand, for the decade 1881-90 the average birth-rate for
England and Wales was 32.34 per thousand, the arithmetic
mean of the rates for the different districts 30.34 only. The
weighted mean was therefore the greater, the birth-rate being
higher in the more populous (urban) districts, in which there is
a greater proportion of young married persons.

For the year 1891 the average population of a Poor-law district
was found to be roughly 45,900 and the standard-deviation $\sigma_w$
56,400 (populations ranging from under 2000 to over half a
million). The standard-deviation $\sigma_x$ of the percentages of the
population in receipt of relief was 1.24. We have therefore,
for the correlation between pauperism and population,

$$r = -\frac{3.29 - 2.69}{1.24} \times \frac{459}{564} = -0.39.$$

For the birth-rate, on the other hand, assuming that $\sigma_w/\bar{w}$
is approximately the same for the decade 1881-90 as in 1891,
we have, $\sigma_x$ being 4.08,

$$r = \frac{32.34 - 30.34}{4.08} \times \frac{459}{564} = +0.40.$$

The closeness of the numerical values of $r$ in the two cases is,
of course, accidental.
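
The two correlations follow from equation (15) solved for $r$, i.e. $r = (M' - M)\,\bar{w}/(\sigma_x \sigma_w)$. The sketch below (Python, not part of the original text) repeats the arithmetic and also verifies the identity (15) on a small hypothetical set of values and weights; the function names are assumptions of the sketch.

```python
import math


def implied_correlation(weighted_mean, unweighted_mean, sd_x, sd_w, mean_w):
    """Equation (15) solved for r, the correlation of weights and variable."""
    return (weighted_mean - unweighted_mean) * mean_w / (sd_x * sd_w)


# Figures quoted in the text (1891 district populations).
r_pauperism = implied_correlation(2.69, 3.29, 1.24, 56400, 45900)
r_birth_rate = implied_correlation(32.34, 30.34, 4.08, 56400, 45900)
print(round(r_pauperism, 2), round(r_birth_rate, 2))

# Check of the identity (15) on hypothetical values x with weights w.
x = [2, 4, 6, 8]
w = [1, 1, 2, 4]
n = len(x)
mean_x, mean_w = sum(x) / n, sum(w) / n
sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
sd_w = math.sqrt(sum((v - mean_w) ** 2 for v in w) / n)
r = (sum((a - mean_x) * (b - mean_w) for a, b in zip(x, w))
     / (n * sd_x * sd_w))
weighted = sum(a * b for a, b in zip(x, w)) / sum(w)
from_formula = mean_x + r * sd_x * sd_w / mean_w
print(weighted, from_formula)
```

For the hypothetical data the weighted mean, 6.25, agrees exactly with the value given by (15), the relation being an algebraic identity.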

18. The principle of weighting finds one very important
application in the treatment of such rates as death-rates, which
are largely affected by the age and sex-composition of the
population. Neglecting, for simplicity, the question of sex, suppose the
numbers of deaths are noted in a certain district for, say, the
age-groups 0-, 10-, 20-, etc., in which the fractions of the whole
population are $p_1$, $p_2$, $p_3$ ..., where $\Sigma(p) = 1$. Let the
death-rates for the corresponding age-groups be $d_1$, $d_2$, $d_3$ .... Then
the ordinary or crude death-rate for the district is

$$D = \Sigma(d \cdot p). \qquad (16)$$

For some other district taken as a basis of comparison, perhaps
the country as a whole, the death-rates and fractions of the
population in the several age-groups may be $\delta_1$, $\delta_2$, $\delta_3$ ..., $\pi_1$,
$\pi_2$, $\pi_3$ ..., and the crude death-rate

$$\Delta = \Sigma(\delta \cdot \pi). \qquad (17)$$

Now $D$ and $\Delta$ may differ either because the $d$'s and $\delta$'s differ
or because the $p$'s and $\pi$'s differ, or both. It may happen that
really both districts are about equally healthy, and the
death-rates approximately the same for all age-classes, but, owing to a
difference of weighting, the first average may be markedly higher
than the second, or vice versa. If the first district be a rural
district and the second urban, for instance, there will be a larger
proportion of the old in the former, and it may possibly have a
higher crude death-rate than the second, in spite of lower
death-rates in every class. The comparison of crude death-rates is
therefore liable to lead to erroneous conclusions. The difficulty
may be got over by averaging the age-class death-rates in the
district not with the weights $p_1$, $p_2$, $p_3$ ... given by its own
population, but with the weights $\pi_1$, $\pi_2$, $\pi_3$ ... given by the
population of the standard district. The standardised death-rate
for the district will then be

$$D' = \Sigma(d \cdot \pi), \qquad (18)$$

and $D'$ and $\Delta$ will be comparable as regards age-distribution.
There is obviously no difficulty in taking sex into account as well
as age if necessary. The death-rates must be noted for each sex
separately in every age-class and averaged with a system of
weights based on the standard population. The method is also
of importance for comparing death-rates in different classes of the
population, e.g. those engaged in given occupations, as well as in
different districts, and is used for both these purposes in the
Decennial Supplements to the Reports of the Registrar-General
for England and Wales (ref. 16).

19. Difficulty may arise in practical cases from the fact that
the death-rates $d_1$, $d_2$, $d_3$ ... are not known for the districts or
classes which it is desired to compare with the standard
population, but only the crude rates $D$ and the fractional populations
of the age-classes $p_1$, $p_2$, $p_3$ .... The difficulty may be partially
obviated (cf. Chap. IV. § 9, pp. 51-3), by forming what is
termed an index death-rate $\Delta'$ for the class or district, $\Delta'$ being
given by

$$\Delta' = \Sigma(\delta \cdot p), \qquad (19)$$

i.e. the rates of the standard population averaged with the
weights of the district population. It is the crude death-rate
that there would be in the district if the rate in every
age-class were the same as in the standard population. An
approximate standardised death-rate for the district or class is
then given by

$$D'' = D \times \frac{\Delta}{\Delta'}. \qquad (20)$$

$D''$ is not necessarily, nor generally, the same as $D'$. It can
only be the same if

$$\frac{\Sigma(d\pi)}{\Sigma(dp)} = \frac{\Sigma(\delta\pi)}{\Sigma(\delta p)}.$$

This will hold good if, e.g., the death-rates in the standard
population and the district stand to one another in the same
ratio in all age-classes, i.e. $\delta_1/d_1 = \delta_2/d_2 = \delta_3/d_3 =$ etc. This method
of standardisation is used in the Annual Summaries of the
Registrar-General for England and Wales.

Both methods of standardisation, that of § 18 and that of the
present section, are of great and growing importance. They are
obviously applicable to other rates besides death-rates, e.g.
birth-rates (cf. refs. 17, 18). Further, they may readily be extended
into quite different fields. Thus it has been suggested (ref. 19)
that standardised average heights or standardised average weights
of the children in different schools might be obtained on the
basis of a standard school population of given age and sex
composition, or indeed of given composition as regards hair and
eye-colour as well.
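
The two methods of standardisation may be compared on a small example. The sketch below (Python, not part of the original text) uses three age-classes with wholly hypothetical death-rates and population fractions, chosen so that the rates of the district stand in a constant ratio to those of the standard population; the function name is an assumption.

```python
def weighted_rate(rates, fractions):
    """Sum of rate x population fraction over the age-classes."""
    return sum(r * f for r, f in zip(rates, fractions))


# Hypothetical district: lower death-rates in every class, but an old population.
d = [0.004, 0.010, 0.060]        # district death-rates by age-class
p = [0.30, 0.40, 0.30]           # district population fractions
# Hypothetical standard population.
delta = [0.005, 0.0125, 0.075]   # standard death-rates by age-class
pi = [0.40, 0.45, 0.15]          # standard population fractions

D = weighted_rate(d, p)                  # crude rate of district, eqn. (16)
Delta = weighted_rate(delta, pi)         # crude rate of standard, eqn. (17)
D_direct = weighted_rate(d, pi)          # standardised rate, eqn. (18)
Delta_index = weighted_rate(delta, p)    # index death-rate, eqn. (19)
D_indirect = D * Delta / Delta_index     # approximate standardised rate, eqn. (20)

print(D, Delta)               # crude rates
print(D_direct, D_indirect)   # standardised rates by the two methods
```

Here the crude rate of the district exceeds that of the standard population although its rate is lower in every class; after standardisation the order is reversed, and the two methods agree because the ratio of the rates is the same in all age-classes.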

20. In §§ 14-17 we have dealt only with the theory of
the weighted arithmetic mean, but it should be noted that
any form of average can be weighted. Thus a weighted median
can be formed by finding the value of the variable such that
the sum of the weights of lesser values is equal to the sum
of the weights of greater values. A weighted mode could be
formed by finding the value of the variable for which the sum
of the weights was greatest, allowing for the smoothing of
casual fluctuations. Similarly, a weighted geometric mean $G'$ could
be calculated by weighting the logarithms of every value of the
variable before taking the arithmetic mean, i.e.

$$\log G' = \frac{\Sigma(W \cdot \log X)}{\Sigma(W)}.$$
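
A minimal sketch of two of these weighted averages (Python, not part of the original text). For discrete values the weighted median is here taken, as one possible convention, to be the smallest value at which the accumulated weight reaches half the total; the function names and data are hypothetical.

```python
import math


def weighted_geometric_mean(values, weights):
    """Antilogarithm of the weighted arithmetic mean of the logarithms."""
    log_mean = (sum(w * math.log(x) for x, w in zip(values, weights))
                / sum(weights))
    return math.exp(log_mean)


def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total
    (a convention assumed for this sketch)."""
    half = sum(weights) / 2.0
    running = 0.0
    for x, w in sorted(zip(values, weights)):
        running += w
        if running >= half:
            return x


print(weighted_geometric_mean([1, 100], [1, 1]))   # equal weights
print(weighted_geometric_mean([2, 8], [3, 1]))     # unequal weights
print(weighted_median([1, 2, 3], [1, 1, 1]))       # equal weights
print(weighted_median([1, 2, 3], [1, 1, 4]))       # heavy weight on 3
```

With equal weights both functions reduce to the ordinary geometric mean and median.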


REFERENCES

Effect of Grouping Observations.

(1) Sheppard, W. F., "On the Calculation of the Average Square, Cube, etc.,
of a large number of Magnitudes," Jour. Roy. Stat. Soc., vol. lx., 1897,
p. 698.

(2) Sheppard, W. F., "On the Calculation of the most probable Values of
Frequency Constants for Data arranged according to Equidistant
Divisions of a Scale," Proc. Lond. Math. Soc., vol. xxix., p. 363. (The
result given in eqn. (4) for the correction of the standard-deviation is
Sheppard's result.)

(3) Sheppard, W. F., "The Calculation of Moments of a Frequency-distribution,"
Biometrika, vol. v., 1907, p. 450.

(4) Pearson, Karl, and others [editorial], "On an Elementary Proof of
Sheppard's Formulae for correcting Raw Moments, and on other allied
points," Biometrika, vol. iii., 1904, p. 308.

(5) Pearson, Karl, "On the Influence of 'Broad Categories' on Correlation,"
Biometrika, vol. ix., 1913, pp. 116-139.

Effect of Errors of Observation on the Correlation-coefficient.

(6) Spearman, C., "The Proof and Measurement of Association between Two
Things," Amer. Jour. of Psychology, vol. xv., 1904, p. 88.
(Formula (8).)

(7) Spearman, C., "Demonstration of Formulae for True Measurement of
Correlation," Amer. Jour. of Psychology, vol. xviii., 1907, p. 161.
(Proof of formula (8), but on different lines to that given in the text,
which was communicated to Spearman in 1908, and published by
Brown and by Spearman in (8) and (10).)

(8) Spearman, C., "Correlation calculated from Faulty Data," British Jour.
of Psychology, vol. iii., 1910, p. 271.

(9) Jacob, S. M., "On the Correlations of Areas of Matured Crops and the
Rainfall," Mem. Asiatic Soc. Bengal, vol. ii., 1910, p. 847. (§ 7
contains remarks on the effects of errors on the correlations and
regressions, with especial reference to this problem.)

(10) Brown, W., "Some Experimental Results in Correlation," Proceedings
of the Sixth International Congress of Psychology, Geneva, August 1909.

Correlations between Indices, etc.

(11) Pearson, Karl, "On a Form of Spurious Correlation which may arise
when Indices are used in the Measurement of Organs," Proc. Roy. Soc.,
vol. lx., 1897, p. 489. (§§ 8, 9.)

(12) Galton, Francis, "Note to the Memoir by Prof. Karl Pearson on
Spurious Correlation," ibid., p. 498.

(13) Yule, G. U., "On the Interpretation of Correlations between Indices or
Ratios," Jour. Roy. Stat. Soc., vol. lxxiii., 1910, p. 644.

(14) Brown, J. W., M. Greenwood, and Frances Wood, "A Study of
Index-Correlations," Jour. Roy. Stat. Soc., vol. lxxvii., 1914, pp. 317-46.

The Weighted Mean.

(15) Pearson, Karl, "Note on Reproductive Selection," Proc. Roy. Soc.,
vol. lix., 1896, p. 301. (Eqn. (15).)

Standardisation or Correction of Death-rates, etc.

(16) Tatham, John, Supplement to the Fifty-fifth Annual Report of the
Registrar-General for England and Wales: Introductory Letters to
Pt. I. and Pt. II.; also Supplement to Sixty-fifth Report: Introductory
Letter to Pt. II. (Cd. 7769, 1895; 8503, 1897; 2619, 1908.)

(17) Newsholme, A., and T. H. C. Stevenson, "The Decline of Human
Fertility in the United Kingdom and other Countries, as shown by
Corrected Birth-rates," Jour. Roy. Stat. Soc., vol. lxix., 1906, p. 34.

(18) Yule, G. U., "On the Changes in the Marriage and Birth Rates in
England and Wales during the past Half Century," etc., ibid., p. 88.

(19) Heron, David, "The Influence of Defective Physique and Unfavourable
Home Environment on the Intelligence of School Children," Eugenics
Laboratory Memoirs, viii., Dulau & Co., London, 1910.

Miscellaneous.

(20) Pearson, Karl, Alice Lee, and L. Bramley-Moore, "Genetic
(reproductive) Selection: Inheritance of Fertility in Man and of
Fecundity in Thoroughbred Racehorses," Phil. Trans. Roy. Soc.,
Series A, vol. cxcii., 1899, p. 257.

(A number of theorems of general application are given in the
introductory part of this memoir, some of which have been utilised in
§§ 12-13 of the preceding chapter.)

EXERCISES 

1. Find the values obtained for the standard-deviations in Examples ii.
(p. 139) and iii. (p. 141) of Chapter VIII. on applying Sheppard's correction
for grouping.

2. Show that if a range of six times the standard-deviation covers at least 18
class-intervals (cf. Chap. VI. § 5), Sheppard's correction will make a difference
of less than 0.5 per cent. in the rough value of the standard-deviation.

3. (Data from the Decennial Supplements to the Annual Reports of the
Registrar-General for England and Wales.) The following particulars are
found for 36 small registration districts in which the number of births in a
decade ranged between 1600 and 2500:

                     Proportion of Male Births per 1000 of all Births.
    Decade.          Mean.        Standard-deviation.
    1881-1890        508.1        12.80
    1891-1900        508.4        10.37
    Both decades     508.25       11.65

It is believed, however, that a great part of the observed standard-deviation
is due to mere "fluctuations of sampling" of no real significance.

Given that the correlation between the proportions of male births in a
district in the two decades is +0.36, estimate (1) the true standard-deviation
freed from such fluctuations of sampling; (2) the standard-deviation of
fluctuations of sampling, i.e. of the errors produced by such fluctuations in the
observed proportions of male births.

4. (Data from Pearson, ref. 11.) The coefficients of variation for breadth,
height, and length of certain skulls are 3.89, 3.50, and 3.24 per cent.
respectively. Find the "spurious correlation" between the breadth/length and
height/length indices, absolute measures being combined at random so that
they are uncorrelated.

5. (Data from Boas, communicated to Pearson; cf. Fawcett and Pearson,
Proc. Roy. Soc., vol. lxii. p. 413.) From short series of measurements on
American Indians the mean coefficient of correlation found between father and
son, and father and daughter, for cephalic index, is 0.14; between mother and
son, and mother and daughter, 0.33. Assuming these coefficients should be
the same if it were not for the looseness of family relations, find the proportion
of children not due to the reputed father.

6. Find the correlation between $X_1 + X_2$ and $X_2 + X_3$, $X_1$, $X_2$ and $X_3$ being
uncorrelated.

7. Find the correlation between $X_1$ and $aX_1 + bX_2$, $X_1$ and $X_2$ being
uncorrelated.

8. (Referring to illustration iv., § 14, Chap. X.) Use the answer to
question 7 to estimate, very roughly, the correlation that would be found
between annual movements in infantile and general mortality if the mortality
of those under and over 1 year of age were uncorrelated. Note that

    general mortality per 1000 of population
        = infantile mortality per 1000 births x (births / population)
          + deaths over one year per 1000 of population,

and treat the ratio of births to population as if it were constant at a rough
average value, say 0.033. The standard-deviation of annual movements in
infantile mortality is that given loc. cit.; that of annual movements in
mortality other than infantile may be taken as sensibly the same as that of
general mortality, or say 1 unit.
9. If the relation

$$a\,x_1 + b\,x_2 + c\,x_3 = 0$$

holds for all values of $x_1$, $x_2$ and $x_3$ (which are, in our usual
notation, deviations from their respective arithmetic means), find the
correlations between $x_1$, $x_2$ and $x_3$ in terms of their
standard-deviations and the values of a, b and c.

10. What is the effect on a weighted mean of errors in the weights or the
quantities weighted, such errors being uncorrelated with each other, with the
weights, or with the variables: (1) if the arithmetic mean values of the errors
are zero; (2) if the arithmetic mean values of the errors are not zero?

11. (Cf. Pearson, "On a Generalised Theory of Alternative Inheritance,"
Phil. Trans., vol. cciii., A, 1904, p. 53.) If we consider the correlation
between number of recessive couplets in parent and in offspring, in a
Mendelian population breeding at random (such as would ultimately result
from an initial cross between a pure dominant and a pure recessive), the
correlation is found to be 1/3 for a total number of couplets n. If n = 1,
the only possible numbers of recessive couplets are 0 and 1, and the
correlation table between parent and offspring reduces to the form

                     Offspring
Parent          0        1        Total
0               5        1        6
1               1        1        2
Total           6        2        8

Verify the correlation, and work out the association coefficient Q.

12. (Cf. the above, and also Snow, Proc. Roy. Soc., vol. lxxxiii., B, 1910,
Table III., p. 42.) For a similar population the correlation between
brothers, assuming a practically infinite size of family, is 5/12. The table is

                     First Brother
Second Brother  0        1        Total
0               41       7        48
1               7        9        16
Total           48       16       64

Verify the correlation, and work out the association coefficient Q.

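13. Referring to the notation of § 10, show that we have the following
expressions for the regressions in a fourfold table:

$$r\,\frac{\sigma_1}{\sigma_2} = \frac{(AB)}{(B)} - \frac{(A\beta)}{(\beta)}, \qquad r\,\frac{\sigma_2}{\sigma_1} = \frac{(AB)}{(A)} - \frac{(\alpha B)}{(\alpha)}.$$

Verify on the tables of questions 11 and 12.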














CHAPTER XII.

PARTIAL CORRELATION.

1-2. Introductory explanation; 3. Direct deduction of the formulae for two
variables; 4. Special notation for the general case: generalised
regressions; 5. Generalised correlations; 6. Generalised deviations and
standard-deviations; 7-8. Theorems concerning the generalised
product-sums; 9. Direct interpretation of the generalised regressions;
10-11. Reduction of the generalised standard-deviation; 12. Reduction
of the generalised regression; 13. Reduction of the generalised
correlation-coefficient; 14. Arithmetical work: Example i., Example
ii.; 15. Geometrical representation of correlation between three
variables by means of a model; 16. The coefficient of n-fold correlation;
17. Expression of regressions and correlations of lower in terms of
those of higher order; 18. Limiting inequalities between the values of
correlation-coefficients necessary for consistence; 19. Fallacies.

1. In Chapters IX.-XI. the theory of the correlation-coefficient for
a single pair of variables has been developed and its applications
illustrated. But in the case of statistics of attributes we found
it necessary to proceed from the theory of simple association for
a single pair of attributes to the theory of association for several
attributes, in order to be able to deal with the complex causation
characteristic of statistics; and similarly the student will find it
impossible to advance very far in the discussion of many problems
in correlation without some knowledge of the theory of multiple
correlation, or correlation between several variables. In such a
problem as that of illustration i., Chap. X., for instance, it might
be found that changes in pauperism were highly correlated
(positively) with changes in the out-relief ratio, and also with
changes in the proportion of old; and the question might arise how
far the first correlation was due merely to a tendency to give
out-relief more freely to the old than the young, i.e. to a correlation
between changes in out-relief and changes in proportion of old.
The question could not at the present stage be answered by working
out the correlation-coefficient between the last pair of variables,
for we have as yet no guide as to how far a correlation between


the variables 1 and 2 can be accounted for by correlations
between 1 and 3 and 2 and 3. Again, in the case of illustration iii.,
Chap. X., a marked positive correlation might be observed between,
say, the bulk of a crop and the rainfall during a certain period, and
practically no correlation between the crop and the accumulated
temperature during the same period; and the question might arise
whether the last result might not be due merely to a negative
correlation between rain and accumulated temperature, the crop
being favourably affected by an increase of accumulated temperature
if other things were equal, but failing as a rule to obtain this
benefit owing to the concomitant deficiency of rain. In the problem
of inheritance in a population, the corresponding problem is
of great importance, as already indicated in Chapter IV. It is
essential for the discussion of possible hypotheses to know whether
an observed correlation between, say, grandson and grandparent
can or cannot be accounted for solely by observed correlations
between grandson and parent, parent and grandparent.

2. Problems of this type, in which it is necessary to consider
simultaneously the relations between at least three variables, and
possibly more, may be treated by a simple and natural extension
of the method used in the case of two variables. The latter case
was discussed by forming linear equations between the two
variables, assigning such values to the constants as to make the
sum of the squares of the errors of estimate as low as possible:
the more complicated case may be discussed by forming linear
equations between any one of the n variables involved, taking
each in turn, and the n - 1 others, again assigning such values to
the constants as to make the sum of the squares of the errors of
estimate a minimum. If the variables are $X_1, X_2, X_3, \ldots, X_n$,
the equation will be of the form

$$X_1 = a + b_2 X_2 + b_3 X_3 + \ldots + b_n X_n$$

If in such a generalised regression or characteristic equation we
find a sensible positive value for any one coefficient such as $b_2$,
we know that there must be a positive correlation between $X_1$
and $X_2$ that cannot be accounted for by mere correlations of $X_1$
and $X_2$ with $X_3$, $X_4$, or $X_n$, for the effects of changes in these
variables are allowed for in the remaining terms on the right.
The magnitude of $b_2$ gives, in fact, the mean change in $X_1$
associated with a unit change in $X_2$ when all the remaining
variables are kept constant. The correlation between $X_1$ and
$X_2$ indicated by $b_2$ may be termed a partial correlation, as
corresponding with the partial association of Chapter IV., and it
is required to deduce from the values of the coefficients b, which
may be termed partial regressions, partial coefficients of correlation
giving the correlation between $X_1$ and $X_2$ or other pair of
variables when the remaining variables $X_3, \ldots, X_n$ are kept
constant, or when changes in these variables are corrected or allowed
for, so far as this may be done with a linear equation. For examples
of such generalised regression-equations the student may turn to
the illustrations worked out below (pp. 239-247).
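
The meaning of a partial regression may be illustrated numerically. The
following is a minimal sketch in Python, assuming artificial data in which
X1 depends on X3 alone while X2 is merely correlated with X3; the function
names and the simulated figures are illustrative assumptions and are not
taken from the text. The total regression of X1 on X2 is then sensibly
positive, but the partial coefficient of X2 is near zero.

```python
import random


def total_regression(y, x):
    """Least-squares slope of y on x alone (a regression of order zero)."""
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    sxy = sum((a - my) * (b - mx) for a, b in zip(y, x))
    sxx = sum((b - mx) ** 2 for b in x)
    return sxy / sxx


def partial_regressions(x1, x2, x3):
    """Least-squares a, b2, b3 in X1 = a + b2*X2 + b3*X3."""
    n = len(x1)
    m1, m2, m3 = sum(x1) / n, sum(x2) / n, sum(x3) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    d3 = [v - m3 for v in x3]
    s22 = sum(u * u for u in d2)
    s33 = sum(u * u for u in d3)
    s23 = sum(u * v for u, v in zip(d2, d3))
    s12 = sum(u * v for u, v in zip(d1, d2))
    s13 = sum(u * v for u, v in zip(d1, d3))
    det = s22 * s33 - s23 * s23          # determinant of the two normal equations
    b2 = (s12 * s33 - s13 * s23) / det
    b3 = (s13 * s22 - s12 * s23) / det
    a = m1 - b2 * m2 - b3 * m3           # the fitted plane passes through the means
    return a, b2, b3


rng = random.Random(0)                    # fixed seed: hypothetical data only
x3 = [rng.gauss(0, 1) for _ in range(2000)]
x2 = [0.8 * v + rng.gauss(0, 1) for v in x3]        # X2 correlated with X3
x1 = [5.0 + 2.0 * v + rng.gauss(0, 1) for v in x3]  # X1 depends on X3 only

total_b12 = total_regression(x1, x2)
a, b2, b3 = partial_regressions(x1, x2, x3)
print(f"total b12 = {total_b12:.2f}, partial b2 = {b2:.2f}, partial b3 = {b3:.2f}")
```

The apparent dependence of X1 on X2 is here wholly accounted for by their
common correlation with X3, which is the kind of case the generalised
regression-equation is designed to detect.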

3. With this explanatory introduction, we may now proceed to
the algebraic theory of such generalised regression-equations and
of multiple correlation in general. It will first, however, be as
well to revert briefly to the case of two variables. In Chapter IX.,
to obtain the greatest possible simplicity of treatment, the value
of the coefficient $r$ was deduced on the special assumption
that the means of all arrays were strictly collinear, and the
meaning of the coefficient in the more general case was
subsequently investigated. Such a process is not conveniently
applicable when a number of variables are to be taken into
account, and the problem has to be faced directly: i.e. required,
to determine the coefficients and constant term, if any, in a
regression-equation, so as to make the sum of the squares of the
errors of estimate a minimum. We will take this problem first
for the case of two variables, introducing a notation that can be
conveniently adapted to more. Let us take the arithmetic
means of the variables as origins of measurement, and let $x_1$, $x_2$
denote deviations of the two variables from their respective
means. Then it is required to determine $a_1$ and $b_{12}$ in the
regression-equation

$$x_1 = a_1 + b_{12}\,x_2 \qquad (a)$$

so as to make $\Sigma(x_1 - a_1 - b_{12}x_2)^2$, for all associated pairs of
deviations $x_1$ and $x_2$, the least possible. Put more briefly, if
we write

$$N\,s_{1.2}^2 = \Sigma(x_1 - a_1 - b_{12}\,x_2)^2 \qquad (b)$$

so that $s_{1.2}$ is the root-mean-square value of the errors of estimate
in using regression-equation (a) (cf. Chap. IX. § 14), it is required
to make $s_{1.2}$ a minimum. Suppose any value whatever to be
assigned to $b_{12}$, and a series of values of $a_1$ to be tried, $s_{1.2}$
being calculated for each. Evidently $s_{1.2}$ would be very large for
values of $a_1$ that erred greatly either in excess or defect of the
best value (for the given value of $b_{12}$), and would continuously
decrease as this best value was approached; the value of $s_{1.2}$ could
never become negative, though possibly, but exceptionally, zero.
If therefore the values of $s_{1.2}$ were plotted to the values of $a_1$ on
a diagram, a curve would be obtained more or less like that
of fig. 44. The best value of $a_1$, for which $s_{1.2}$ attained its
minimum value, say $\sigma_{1.2}$, could be approximately estimated from
such a diagram; but it can be calculated with much more exactness
from the condition that if two values be tried close above
and below the best, the corresponding values of $s_{1.2}$ are equal. Let
$a_1$ and $(a_1 + \delta)$ be two such values. Then if

$$\Sigma(x_1 - a_1 - b_{12}x_2)^2 = \Sigma(x_1 - a_1 - \delta - b_{12}x_2)^2$$

when $\delta$ is very small, the value of $a_1$ is the best for the assigned
value of $b_{12}$. But, evidently, the equation gives, neglecting
the term in $\delta^2$,

$$\Sigma(x_1 - a_1 - b_{12}x_2) = 0,$$

that is,

$$a_1 = 0$$

whatever the value of $b_{12}$. This is a direct proof of the
result that no constant term need be introduced on the right
of a regression-equation when written in terms of deviations
from the arithmetic mean, or that the two lines of regression
must pass through the mean (Chap. IX. § 10). We may
therefore omit any constant term. If, now, $b_{12}$ be assigned
the best value, we must have, by similar reasoning, for slightly
differing values, $b_{12}$ and $b_{12} + \delta$,

$$\Sigma(x_1 - b_{12}x_2)^2 = \Sigma(x_1 - [b_{12} + \delta]x_2)^2.$$

That is, again neglecting terms in $\delta^2$,

$$\Sigma\,x_2(x_1 - b_{12}x_2) = 0 \qquad (c)$$

or, breaking up the sum,

$$b_{12} = \frac{\Sigma(x_1 x_2)}{\Sigma(x_2^2)} = r_{12}\,\frac{\sigma_1}{\sigma_2},$$

which is the value found by the previous indirect method of
Chapter IX. From the fact that $b_{12}$ is determined so as to
make the value of $\Sigma(x_1 - b_{12}x_2)^2$ the least possible, the method
of determination is sometimes called the method of least squares.
Evidently all the remaining results of Chapter IX. follow from
this, and notably we have for $\sigma_{1.2}$, the minimum value of $s_{1.2}$,
the standard-deviation of errors of estimate,

$$\sigma_{1.2}^2 = \sigma_1^2(1 - r_{12}^2) \qquad (d)$$
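
The two-variable results of this section lend themselves to a direct
numerical check. The sketch below, in Python, uses artificial data (an
assumption made purely for illustration) and verifies that the
least-square value of the regression equals r12 . sigma1/sigma2, that the
mean-square error of estimate equals sigma1^2 (1 - r12^2), and that trial
values equally far above and below the best value give equal errors.

```python
import math
import random

rng = random.Random(1)                       # hypothetical data, fixed seed
x2_raw = [rng.gauss(0, 2) for _ in range(500)]
x1_raw = [0.5 * v + rng.gauss(0, 1) for v in x2_raw]

n = len(x1_raw)
m1, m2 = sum(x1_raw) / n, sum(x2_raw) / n
x1 = [v - m1 for v in x1_raw]                # deviations from the means
x2 = [v - m2 for v in x2_raw]

s1 = math.sqrt(sum(v * v for v in x1) / n)   # standard-deviations
s2 = math.sqrt(sum(v * v for v in x2) / n)
r12 = sum(u * v for u, v in zip(x1, x2)) / (n * s1 * s2)

# Equation (c): the sum of x2 * (x1 - b12 * x2) vanishes at the best b12.
b12 = sum(u * v for u, v in zip(x1, x2)) / sum(v * v for v in x2)


def s_squared(b):
    """Mean-square error of estimate for a trial regression b."""
    return sum((u - b * v) ** 2 for u, v in zip(x1, x2)) / n


sigma_1_2_sq = s_squared(b12)                # minimum value, equation (d)
print(b12, r12 * s1 / s2)
print(sigma_1_2_sq, s1 ** 2 * (1 - r12 ** 2))
```

Because the mean-square error is a quadratic in the trial value, the curve
of fig. 44 is symmetrical about the best value, which is why the
"equal values close above and below" condition locates the minimum.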

4. Now apply the same method to the regression-equation
for n variables. Writing the equation in terms of deviations,
it follows from reasoning precisely similar to that given above
that no constant term need be entered on the right-hand
side. For the partial regression-coefficients (the coefficients of
the x's on the right) a special notation will be used in order
that the exact position of each coefficient may be rendered quite
definite. The first subscript affixed to the letter b (which will
always be used to denote a regression) will be the subscript of
the x on the left (the dependent variable), and the second will
be the subscript of the x to which it is attached; these may
be called the primary subscripts. After the primary subscripts,
and separated from them by a point, are placed the subscripts
of all the remaining variables on the right-hand side as secondary
subscripts. The regression-equation will therefore be written
in the form

$$x_1 = b_{12.34\ldots n}\,x_2 + b_{13.24\ldots n}\,x_3 + \ldots + b_{1n.23\ldots(n-1)}\,x_n \qquad (1)$$

The order in which the secondary subscripts are written is,
it should be noted, quite indifferent, but the order of the
primary subscripts is material; e.g. $b_{12.3\ldots n}$ and $b_{21.3\ldots n}$
denote quite distinct coefficients, $x_1$ being the dependent variable
in the first case and $x_2$ in the second. A coefficient with p
secondary subscripts may be termed a regression of the pth order.
The regressions $b_{12}$, $b_{21}$, $b_{13}$, $b_{31}$, etc., in the case of two
variables may be regarded as of order zero, and may be termed total as
distinct from partial regressions.

5. In the case of two variables, the correlation-coefficient $r_{12}$
may be regarded as defined by the equation

$$r_{12} = (b_{12}\,b_{21})^{1/2}.$$

We shall generalise this equation in the form

$$r_{12.34\ldots n} = (b_{12.34\ldots n}\,b_{21.34\ldots n})^{1/2} \qquad (2)$$

This is at present a pure definition of a new symbol, and it
remains to be shown that $r_{12.34\ldots n}$ may really be regarded as,
and possesses all the properties of, a correlation-coefficient; the
name may, however, be applied to it, pending the proof. A
correlation-coefficient with p secondary subscripts will be termed
a correlation of order p. Evidently, in the case of a
correlation-coefficient, the order in which both primary and secondary
subscripts are written is indifferent, for the right-hand side of
equation (2) is unaltered by writing 2 for 1 and 1 for 2. The
correlations $r_{12}$, $r_{13}$, etc., may be regarded as of order zero, and
spoken of as total, as distinct from partial, correlations.

6. If the regressions $b_{12.34\ldots n}$, $b_{13.24\ldots n}$, etc., be assigned the
"best" values, as determined by the method of least squares, the
difference between the actual value of $x_1$ and the value assigned
by the right-hand side of the regression-equation (1), that is, the
error of estimate, will be denoted by $x_{1.23\ldots n}$; i.e. as a
definition we have

$$x_{1.23\ldots n} = x_1 - b_{12.34\ldots n}\,x_2 - b_{13.24\ldots n}\,x_3 - \ldots - b_{1n.23\ldots(n-1)}\,x_n \qquad (3)$$

where $x_1, x_2, \ldots, x_n$ are assigned any one set of observed values.
Such an error (or residual, as it is sometimes called) denoted by a
symbol with p secondary suffixes, will be termed a deviation of the
pth order. Finally, we will define a generalised standard-deviation
$\sigma_{1.23\ldots n}$ by the equation

$$N\,\sigma_{1.23\ldots n}^2 = \Sigma(x_{1.23\ldots n}^2) \qquad (4)$$

N being, as usual, the number of observations. A standard-deviation
denoted by a symbol with p secondary suffixes will be
termed a standard-deviation of the pth order, the
standard-deviations $\sigma_1$, $\sigma_2$, etc., being regarded as of order zero, the
standard-deviations $\sigma_{1.2}$, $\sigma_{2.1}$, etc. (cf. eqn. (d) of § 3) of the
first order, and so on.

7. From the reasoning of § 3 it follows that the "least-square"
values of the partial regressions $b_{12.34\ldots n}$, etc., will be given by
equations of the form

$$\Sigma(x_1 - b_{12.34\ldots n}\,x_2 - \ldots - b_{1n.23\ldots(n-1)}\,x_n)^2 = \Sigma(x_1 - [b_{12.34\ldots n} + \delta]\,x_2 - \ldots - b_{1n.23\ldots(n-1)}\,x_n)^2,$$

$\delta$ being very small. That is, neglecting the term in $\delta^2$,

$$\Sigma\,x_2(x_1 - b_{12.34\ldots n}\,x_2 - \ldots - b_{1n.23\ldots(n-1)}\,x_n) = 0,$$

or, more briefly, in terms of the notation of equation (3),

$$\Sigma(x_2\,.\,x_{1.23\ldots n}) = 0 \qquad (5)$$

There are a large number of these equations, (n - 1) for determining
the coefficients $b_{12.34\ldots n}$, etc., (n - 1) again for determining
the coefficients $b_{21.34\ldots n}$, etc., and so on: they are sometimes
termed the normal equations. If the student will follow the process
by which (5) was obtained, he will see that when the condition
is expressed that $b_{12.34\ldots n}$ shall possess the "least-square"
value, $x_2$ enters into the product-sum with $x_{1.23\ldots n}$; when the
same condition is expressed for $b_{13.24\ldots n}$, $x_3$ enters into the
product-sum, and so on. Taking each regression in turn, in fact,
every x the suffix of which is included in the secondary suffixes
of $x_{1.23\ldots n}$ enters into the product-sum. The normal equations
of the form (5) are therefore equivalent to the theorem:

The product-sum of any deviation of order zero with any deviation
of higher order is zero, provided the subscript of the former occur
among the secondary subscripts of the latter.
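
The theorem may be checked on any set of figures. The following Python
sketch, for three variables and with artificial data (an assumption for
illustration only; the helper names `deviations` and `psum` are likewise
hypothetical), solves the two normal equations for the regressions of x1 on
x2 and x3, forms the deviation of the second order x_{1.23}, and prints its
product-sums with x2 and x3, both of which vanish to within rounding.

```python
import random

rng = random.Random(2)                       # hypothetical data, fixed seed
n = 300
raw3 = [rng.gauss(0, 1) for _ in range(n)]
raw2 = [0.6 * v + rng.gauss(0, 1) for v in raw3]
raw1 = [1.5 * u - 0.7 * v + rng.gauss(0, 1) for u, v in zip(raw2, raw3)]


def deviations(values):
    """Deviations of a series from its arithmetic mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def psum(u, v):
    """Product-sum of two series of deviations."""
    return sum(a * b for a, b in zip(u, v))


x1, x2, x3 = deviations(raw1), deviations(raw2), deviations(raw3)

# The two normal equations for b12.3 and b13.2, solved by determinants.
det = psum(x2, x2) * psum(x3, x3) - psum(x2, x3) ** 2
b12_3 = (psum(x1, x2) * psum(x3, x3) - psum(x1, x3) * psum(x2, x3)) / det
b13_2 = (psum(x1, x3) * psum(x2, x2) - psum(x1, x2) * psum(x2, x3)) / det

# Deviation of the second order, x_{1.23}, as defined by equation (3).
x1_23 = [a - b12_3 * b - b13_2 * c for a, b, c in zip(x1, x2, x3)]

print(psum(x2, x1_23), psum(x3, x1_23))      # both zero up to rounding error
```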

8. But it follows from this that

$$\Sigma(x_{1.34\ldots n}\,.\,x_{2.34\ldots n}) = \Sigma\,x_{1.34\ldots n}(x_2 - b_{23.4\ldots n}\,x_3 - \ldots - b_{2n.34\ldots(n-1)}\,x_n) = \Sigma(x_{1.34\ldots n}\,.\,x_2).$$

Similarly,

$$\Sigma(x_{1.34\ldots n}\,.\,x_{2.34\ldots n}) = \Sigma(x_1\,.\,x_{2.34\ldots n}).$$

Similarly again,

$$\Sigma(x_{1.34\ldots n}\,.\,x_{2.34\ldots(n-1)}) = \Sigma(x_{1.34\ldots n}\,.\,x_2),$$

and so on. Therefore, quite generally,

$$\Sigma(x_{1.34\ldots n}\,.\,x_{2.34\ldots n}) = \Sigma(x_{1.34\ldots(n-1)}\,.\,x_{2.34\ldots n}) = \Sigma(x_1\,.\,x_{2.34\ldots n}) = \Sigma(x_{1.34\ldots n}\,.\,x_{2.34\ldots(n-1)}) = \Sigma(x_{1.34\ldots n}\,.\,x_2).$$

Comparing all the equal product-sums that may be obtained
in this way, we see that the product-sum of any two deviations is
unaltered by omitting any or all of the secondary subscripts of either
which are common to the two, and, conversely, the product-sum of any
deviation of order p with a deviation of order p + q, the p subscripts
being the same in each case, is unaltered by adding to the secondary
subscripts of the former any or all of the q additional subscripts of
the latter.

It follows therefore from (5) that any product-sum is zero if all
the subscripts of the one deviation occur among the secondary
subscripts of the other. As the simplest case, we may note that $x_1$ is
uncorrelated with $x_{2.1}$, and $x_2$ uncorrelated with $x_{1.2}$.

The theorems of this and of the preceding paragraph are of 
fundamental importance, and should be carefully remembered. 






9. We have now from §§ 7 and 8:

$$0 = \Sigma(x_{2.34\ldots n}\,.\,x_{1.234\ldots n})$$
$$\;\; = \Sigma\,x_{2.34\ldots n}(x_1 - b_{12.34\ldots n}\,x_2 - \text{terms in } x_3 \text{ to } x_n)$$
$$\;\; = \Sigma(x_1\,.\,x_{2.34\ldots n}) - b_{12.34\ldots n}\,\Sigma(x_2\,.\,x_{2.34\ldots n})$$
$$\;\; = \Sigma(x_{1.34\ldots n}\,.\,x_{2.34\ldots n}) - b_{12.34\ldots n}\,\Sigma(x_{2.34\ldots n}^2).$$

That is,

$$b_{12.34\ldots n} = \frac{\Sigma(x_{1.34\ldots n}\,.\,x_{2.34\ldots n})}{\Sigma(x_{2.34\ldots n}^2)}.$$

But this is the value that would have been obtained by taking a
regression-equation of the form

$$x_{1.34\ldots n} = b_{12.34\ldots n}\,.\,x_{2.34\ldots n}$$

and determining $b_{12.34\ldots n}$ by the method of least-squares, i.e.
$b_{12.34\ldots n}$ is the regression of $x_{1.34\ldots n}$ on $x_{2.34\ldots n}$. It follows
at once from (2) that $r_{12.34\ldots n}$ is the correlation between
$x_{1.34\ldots n}$ and $x_{2.34\ldots n}$, and from (4) that we may write

$$b_{12.34\ldots n} = r_{12.34\ldots n}\,\frac{\sigma_{1.34\ldots n}}{\sigma_{2.34\ldots n}} \qquad (8)$$

an equation identical with the familiar relation $b_{12} = r_{12}\,\sigma_1/\sigma_2$,
with the secondary suffixes 34 . . . n added throughout.

To illustrate the meaning of the equation by the simplest case,
if we had three variables only, $x_1$, $x_2$ and $x_3$, the value of $b_{12.3}$ or
$r_{12.3}$ could be determined (1) by finding the correlations $r_{13}$ and
$r_{23}$ and the corresponding regressions $b_{13}$ and $b_{23}$; (2) working out
the residuals $x_1 - b_{13}x_3$ and $x_2 - b_{23}x_3$ for all associated deviations;
(3) working out the correlation between the residuals associated
with the same values of $x_3$. The method would not, however, be
a practical one, as the arithmetic would be extremely lengthy,
much more lengthy than the method given below for expressing
a correlation of order p in terms of correlations of order p - 1.
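
Though too lengthy for hand arithmetic, the three steps just described are
easily carried out by machine. The following Python sketch, on artificial
data (an assumption for illustration; the helper names are hypothetical),
forms the residuals of x1 and x2 on x3, correlates them, and compares the
result with the value of r_{12.3} given by definition (2), the square root
of the product of the two partial regressions taken with their common sign.

```python
import math
import random

rng = random.Random(3)                       # hypothetical data, fixed seed
n = 400
raw3 = [rng.gauss(0, 1) for _ in range(n)]
raw1 = [0.9 * v + rng.gauss(0, 1) for v in raw3]
raw2 = [0.5 * u - 0.4 * v + rng.gauss(0, 1) for u, v in zip(raw1, raw3)]


def deviations(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def psum(u, v):
    return sum(a * b for a, b in zip(u, v))


x1, x2, x3 = deviations(raw1), deviations(raw2), deviations(raw3)

# Steps (1) and (2): total regressions on x3 and the resulting residuals.
b13 = psum(x1, x3) / psum(x3, x3)
b23 = psum(x2, x3) / psum(x3, x3)
x1_3 = [a - b13 * c for a, c in zip(x1, x3)]
x2_3 = [b - b23 * c for b, c in zip(x2, x3)]

# Step (3): correlation between the two series of residuals.
r_residuals = psum(x1_3, x2_3) / math.sqrt(psum(x1_3, x1_3) * psum(x2_3, x2_3))

# Definition (2): partial regressions from the normal equations.
numerator = psum(x1, x2) * psum(x3, x3) - psum(x1, x3) * psum(x2, x3)
b12_3 = numerator / (psum(x2, x2) * psum(x3, x3) - psum(x2, x3) ** 2)
b21_3 = numerator / (psum(x1, x1) * psum(x3, x3) - psum(x1, x3) ** 2)
r12_3 = math.copysign(math.sqrt(b12_3 * b21_3), b12_3)

print(r_residuals, r12_3)                    # the two values agree
```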

10. Any standard-deviation of order p may be expressed in terms
of a standard-deviation of order p - 1 and a correlation of order p - 1.
For,

$$\Sigma(x_{1.23\ldots n}^2) = \Sigma(x_{1.23\ldots(n-1)}\,.\,x_{1.23\ldots n})$$
$$\;\; = \Sigma\,x_{1.23\ldots(n-1)}(x_1 - b_{1n.23\ldots(n-1)}\,x_n - \text{terms in } x_2 \text{ to } x_{n-1})$$
$$\;\; = \Sigma(x_{1.23\ldots(n-1)}^2) - b_{1n.23\ldots(n-1)}\,\Sigma(x_{1.23\ldots(n-1)}\,.\,x_{n.23\ldots(n-1)})$$

or, dividing through by the number of observations,

$$\sigma_{1.23\ldots n}^2 = \sigma_{1.23\ldots(n-1)}^2\,(1 - b_{1n.23\ldots(n-1)}\,.\,b_{n1.23\ldots(n-1)})$$
$$\;\; = \sigma_{1.23\ldots(n-1)}^2\,(1 - r_{1n.23\ldots(n-1)}^2) \qquad (9)$$

This is again the relation of the familiar form

$$\sigma_{1.n}^2 = \sigma_1^2(1 - r_{1n}^2)$$

with the secondary suffixes 23 . . . (n - 1) added throughout.
It is clear from (9) that $r_{1n.23\ldots(n-1)}$, like any correlation of order
zero, cannot be numerically greater than unity. It also follows
at once that if we have been estimating $x_1$ from $x_2, x_3, \ldots, x_{n-1}$,
$x_n$ will not increase the accuracy of estimate unless $r_{1n.23\ldots(n-1)}$
(not $r_{1n}$) differ from zero. This condition is somewhat interesting,
as it leads to rather unexpected results. For example, if $r_{12} = +0.8$,
$r_{13} = +0.4$, $r_{23} = +0.5$, it will not be possible to estimate $x_1$ with
any greater accuracy from $x_2$ and $x_3$ than from $x_2$ alone, for the
value of $r_{13.2}$ is zero (see below, § 13).

11. It should be noted that, in equation (9), any other subscript
can be eliminated in the same way as subscript n from the suffix of
$\sigma_{1.23\ldots n}$, so that a standard-deviation of order p can be expressed
in p ways in terms of standard-deviations of the next lower order.
This is useful as affording an independent check on arithmetic.
Further, $\sigma_{1.23\ldots(n-1)}$ can be expressed in the same way in terms
of $\sigma_{1.23\ldots(n-2)}$, and so on, so that we must have

$$\sigma_{1.23\ldots n}^2 = \sigma_1^2\,(1 - r_{12}^2)(1 - r_{13.2}^2)(1 - r_{14.23}^2) \ldots (1 - r_{1n.23\ldots(n-1)}^2) \qquad (10)$$

This is an extremely convenient expression for arithmetical use;
the arithmetic can again be subjected to an absolute check by
eliminating the subscripts in a different, say the inverse, order.
Apart from the algebraic proof, it is obvious that the values must
be identical; for if we are estimating one variable from n others, it
is clearly indifferent in what order the latter are taken into account.
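
The check by elimination in a different order may be sketched as follows
in Python for three variables. The three total correlations are arbitrary
illustrative values (an assumption, not data from the text), and the
first-order correlations are obtained from the three-variable form of the
reduction formula established in § 13.

```python
import math


def first_order(r_ab, r_ac, r_bc):
    """r_{ab.c} from the three total correlations (three-variable form, § 13)."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Illustrative values only.
sigma_1 = 1.0
r12, r13, r23 = 0.6, 0.5, 0.4

# Equation (10), taking x2 into account first and then x3 ...
via_2_then_3 = sigma_1 ** 2 * (1 - r12 ** 2) * (1 - first_order(r13, r12, r23) ** 2)
# ... and in the inverse order, x3 first and then x2.
via_3_then_2 = sigma_1 ** 2 * (1 - r13 ** 2) * (1 - first_order(r12, r13, r23) ** 2)

print(via_2_then_3, via_3_then_2)            # identical up to rounding
```

Both routes give the same value of the squared standard-deviation of the
second order, as the argument of the text requires.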
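
12. Any regression of order p may be expressed in terms of
regressions of order p - 1. For we have

$$\Sigma(x_{1.34\ldots n}\,.\,x_{2.34\ldots n}) = \Sigma(x_{1.34\ldots(n-1)}\,.\,x_{2.34\ldots n})$$
$$\;\; = \Sigma\,x_{1.34\ldots(n-1)}(x_2 - b_{2n.34\ldots(n-1)}\,x_n - \text{terms in } x_3 \text{ to } x_{n-1})$$
$$\;\; = \Sigma(x_{1.34\ldots(n-1)}\,.\,x_{2.34\ldots(n-1)}) - b_{2n.34\ldots(n-1)}\,\Sigma(x_{1.34\ldots(n-1)}\,.\,x_{n.34\ldots(n-1)}).$$

Replacing $b_{2n.34\ldots(n-1)}$ by $b_{n2.34\ldots(n-1)}\,.\,\sigma_{2.34\ldots(n-1)}^2/\sigma_{n.34\ldots(n-1)}^2$,
we have

$$b_{12.34\ldots n}\,\sigma_{2.34\ldots n}^2 = b_{12.34\ldots(n-1)}\,\sigma_{2.34\ldots(n-1)}^2 - b_{1n.34\ldots(n-1)}\,.\,b_{n2.34\ldots(n-1)}\,\sigma_{2.34\ldots(n-1)}^2,$$

or, from (9),

$$b_{12.34\ldots n} = \frac{b_{12.34\ldots(n-1)} - b_{1n.34\ldots(n-1)}\,.\,b_{n2.34\ldots(n-1)}}{1 - b_{2n.34\ldots(n-1)}\,.\,b_{n2.34\ldots(n-1)}} \qquad (11)$$

The student should note that this is an expression of the form

$$b_{12.n} = \frac{b_{12} - b_{1n}\,.\,b_{n2}}{1 - b_{2n}\,.\,b_{n2}}$$

with the subscripts 34 . . . (n - 1) added throughout. The
coefficient $b_{12.34\ldots n}$ may therefore be regarded as determined
from a regression-equation of the form

$$x_{1.34\ldots(n-1)} = b_{12.34\ldots n}\,.\,x_{2.34\ldots(n-1)} + b_{1n.23\ldots(n-1)}\,.\,x_{n.34\ldots(n-1)},$$

i.e. it is the partial regression of $x_{1.34\ldots(n-1)}$ on $x_{2.34\ldots(n-1)}$,
$x_{n.34\ldots(n-1)}$ being given. As any other secondary suffix might
have been eliminated in lieu of n, we might also regard it as
the partial regression of $x_{1.45\ldots n}$ on $x_{2.45\ldots n}$, $x_{3.45\ldots n}$ being
given, and so on.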

13. From equation (11) we may readily obtain a corresponding
equation for correlations. For (11) may be written

$$b_{12.34\ldots n} = \frac{r_{12.34\ldots(n-1)} - r_{1n.34\ldots(n-1)}\,.\,r_{2n.34\ldots(n-1)}}{1 - r_{2n.34\ldots(n-1)}^2}\;.\;\frac{\sigma_{1.34\ldots(n-1)}}{\sigma_{2.34\ldots(n-1)}}.$$

Hence, writing down the corresponding expression for $b_{21.34\ldots n}$
and taking the square root,

$$r_{12.34\ldots n} = \frac{r_{12.34\ldots(n-1)} - r_{1n.34\ldots(n-1)}\,.\,r_{2n.34\ldots(n-1)}}{(1 - r_{1n.34\ldots(n-1)}^2)^{1/2}\,(1 - r_{2n.34\ldots(n-1)}^2)^{1/2}} \qquad (12)$$

This is, similarly, the expression for three variables

$$r_{12.n} = \frac{r_{12} - r_{1n}\,.\,r_{2n}}{(1 - r_{1n}^2)^{1/2}\,(1 - r_{2n}^2)^{1/2}}$$

with the secondary subscripts added throughout, and $r_{12.34\ldots n}$
can be assigned interpretations corresponding to those of $b_{12.34\ldots n}$
above. Evidently equation (12) permits of an absolute check on
the arithmetic in the calculation of all partial coefficients of an
order higher than the first, for any one of the secondary suffixes
of $r_{12.34\ldots n}$ can be eliminated so as to obtain another equation of
the same form as (12), and the value obtained for $r_{12.34\ldots n}$ by
inserting the values of the coefficients of lower order in the
expression on the right must be the same in each case.
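
Equation (12) lends itself to repeated application, each step lowering the
order by one. The following Python sketch is one possible arrangement of
that recursion; the function names, the dictionary layout and the
four-variable correlations are illustrative assumptions, not taken from the
text. It also applies the absolute check described above, eliminating the
secondary suffixes in two different orders.

```python
import math


def make_table(pairs):
    """Symmetric lookup of total correlations keyed by (i, j)."""
    table = {}
    for (i, j), value in pairs.items():
        table[(i, j)] = value
        table[(j, i)] = value
    return table


def partial_corr(r, i, j, given):
    """r_{ij.given}, by repeated use of equation (12).

    `given` is a tuple of secondary subscripts; the last is eliminated first.
    """
    if not given:
        return r[(i, j)]
    k, rest = given[-1], given[:-1]
    r_ij = partial_corr(r, i, j, rest)
    r_ik = partial_corr(r, i, k, rest)
    r_jk = partial_corr(r, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


# Hypothetical total correlations for four variables.
r = make_table({(1, 2): 0.5, (1, 3): 0.4, (1, 4): 0.3,
                (2, 3): 0.3, (2, 4): 0.2, (3, 4): 0.1})

one_way = partial_corr(r, 1, 2, (3, 4))      # eliminate 4, then 3
other_way = partial_corr(r, 1, 2, (4, 3))    # eliminate 3, then 4
print(one_way, other_way)                    # the two routes agree

# A case in which the first-order coefficient vanishes although r13 does not.
three = make_table({(1, 2): 0.8, (1, 3): 0.4, (2, 3): 0.5})
r13_2 = partial_corr(three, 1, 3, (2,))
```

The recursion recomputes lower-order coefficients several times over; for
hand work the tabular arrangement of the examples below is more economical,
but for a handful of variables the simplicity of the recursion is an
advantage.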

14. The equations now obtained provide all that is necessary
for the arithmetical solution of problems in multiple correlation.
The best mode of procedure on the whole, having calculated all
the correlations and standard-deviations of order zero, is (1) to
calculate the correlations of higher order by successive applications
of equation (12); (2) to calculate any required standard-deviations
by equation (10); (3) to calculate any required regressions by
equation (8): the use of equation (11) for calculating the
regressions of successive orders directly from each other is
comparatively clumsy. We will give two illustrations, the first for
three and the second for four variables. The introduction of
more variables does not involve any difference in the form of the
arithmetic, but rapidly increases the amount.

Example i. The first illustration we shall take will be a
continuation of example i. of Chapter IX., in which the correlation
was worked out between (1) the average earnings of agricultural
labourers and (2) the percentage of the population in
receipt of Poor-law relief in a group of 38 rural districts. In
Question 2 of the same chapter are given (3) the ratios of the
numbers in receipt of outdoor relief to the numbers relieved in the
workhouse, in the same districts. Required to work out the partial
correlations, regressions, etc., for these three variables.

Using as our notation $X_1$ = average earnings, $X_2$ = percentage of
population in receipt of relief, $X_3$ = out-relief ratio, the first constants
determined are:

M1 = 15.9 shillings     sigma1 = 1.71 shillings     r12 = -0.66
M2 = 3.67 per cent.     sigma2 = 1.29 per cent.     r13 = -0.13
M3 = 5.79               sigma3 = 3.09               r23 = +0.60

To obtain the partial correlations, equation (12) is used direct in
its simplest form:

$$r_{12.3} = \frac{r_{12} - r_{13}\,.\,r_{23}}{(1 - r_{13}^2)^{1/2}\,(1 - r_{23}^2)^{1/2}}$$


The work is best done systematically and the results collected
in tabular form, especially if logarithms are used, as many of the
logarithms occur repeatedly. First it will be noted that the
logarithms of $(1 - r^2)^{1/2}$ occur in all the denominators; these had,
accordingly, better be worked out at once and tabulated (col. 2 of
the table below). In col. 3 the product term of the numerator of
each partial coefficient is entered, i.e. the product of the two other
coefficients on the remaining lines in col. 1; subtracting this from
the coefficient on the same line in col. 1 we have the numerator (col.
4) and can enter its logarithm. The logarithm of the denominator
(col. 6) is obtained at once by adding the two logarithms of
$(1 - r^2)^{1/2}$ on the remaining lines of the table, and subtracting the
logarithms of the denominators from those of the numerators we have
the logarithms of the correlations of the first-order. It is also as well
to calculate at once, for reference in the calculation of
standard-deviations of the second-order, the values of log $\sqrt{1 - r^2}$ for the
first-order coefficients (col. 9).

1.               2.           3.         4.         5.         6.         7.         8.                 9.
Correlation of   log          Product    Numerator  log        log        Correlation of first order    log
zero order       sqrt(1-r^2)  term of               Numerator  Denomi-    log        Value              sqrt(1-r^2)
                              numerator                        nator
r12 = -0.66      1̄.87580      -0.0780    -0.5820    1̄.76492    1̄.89938    1̄.86554    r12.3 = -0.73      1̄.83216
r13 = -0.13      1̄.99629      -0.3960    +0.2660    1̄.42488    1̄.77889    1̄.64599    r13.2 = +0.44      1̄.95267
r23 = +0.60      1̄.90309      +0.0858    +0.5142    1̄.71113    1̄.87209    1̄.83904    r23.1 = +0.69      1̄.85946
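
The first-order correlations of the table may be reproduced without
logarithms. The short Python sketch below applies the simplest form of
equation (12) to the three total correlations of the example; only the
function name `first_order` is an assumption introduced for illustration.

```python
import math


def first_order(r_ab, r_ac, r_bc):
    """r_{ab.c} from the three total correlations, by equation (12)."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Total correlations of Example i.
r12, r13, r23 = -0.66, -0.13, 0.60

r12_3 = first_order(r12, r13, r23)
r13_2 = first_order(r13, r12, r23)
r23_1 = first_order(r23, r12, r13)

print(f"r12.3 = {r12_3:+.2f}, r13.2 = {r13_2:+.2f}, r23.1 = {r23_1:+.2f}")
```

To two places of decimals the values agree with col. 8 of the table.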

Having obtained the correlations we can now proceed to the
regressions. If we wish to find all the regression-equations, we
shall have six regressions to calculate from equations of the form

$$b_{12.3} = r_{12.3}\,\frac{\sigma_{1.3}}{\sigma_{2.3}}.$$

These will involve all the six standard-deviations of the first
order $\sigma_{1.2}$, $\sigma_{1.3}$, $\sigma_{2.1}$, $\sigma_{2.3}$, etc. But the standard-deviations of
the first-order are not in themselves of much interest, and the
standard-deviations of the second-order are so, as being the
standard-errors or root-mean-square errors of estimate made in
using the regression-equations of the second-order. We may
save needless arithmetic, therefore, by replacing the
standard-deviations of the first-order by those of the second, omitting the
former entirely, and transforming the above equation for $b_{12.3}$
to the form

$$b_{12.3} = r_{12.3}\,\frac{\sigma_{1.23}}{\sigma_{2.13}}.$$

This transformation is a useful one and should be noted by the
student. The values of each $\sigma$ may be calculated twice
independently by the formulae of the form

$$\sigma_{1.23} = \sigma_1\,(1 - r_{12}^2)^{1/2}\,(1 - r_{13.2}^2)^{1/2} = \sigma_1\,(1 - r_{13}^2)^{1/2}\,(1 - r_{12.3}^2)^{1/2}$$

so as to check the arithmetic; the work is rapidly done if the
values of log $\sqrt{1 - r^2}$ have been tabulated. The values found are

log sigma1.23 = 0.06146     sigma1.23 = 1.15
log sigma2.13 = 1̄.84584     sigma2.13 = 0.70
log sigma3.12 = 0.34571     sigma3.12 = 2.22


From these and the logarithms of the r's we have

log b12.3 = 0.08116,  b12.3 = -1.21      log b13.2 = 1̄.36174,  b13.2 = +0.23
log b21.3 = 1̄.64993,  b21.3 = -0.45      log b23.1 = 1̄.33917,  b23.1 = +0.22
log b31.2 = 1̄.93024,  b31.2 = +0.85      log b32.1 = 0.33891,  b32.1 = +2.18

That is, the regression-equations are

(1) $x_1 = -1.21\,x_2 + 0.23\,x_3$
(2) $x_2 = -0.45\,x_1 + 0.22\,x_3$
(3) $x_3 = +0.85\,x_1 + 2.18\,x_2$

or, transferring the origins to zero,

(1) Earnings          $X_1 = +19.0 - 1.21\,X_2 + 0.23\,X_3$
(2) Pauperism         $X_2 = +9.55 - 0.45\,X_1 + 0.22\,X_3$
(3) Out-relief ratio  $X_3 = -15.7 + 0.85\,X_1 + 2.18\,X_2$

The units are throughout one shilling for the earnings $X_1$, 1
per cent. for the pauperism $X_2$, and 1 for the out-relief ratio $X_3$.
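
The standard-deviations of the second order and the six regressions may be
recomputed directly from the constants of order zero. The Python sketch
below follows the procedure of the text (first-order correlations by
equation (12), standard-deviations by equation (10), regressions from the
transformed relation); the function names and the dictionary layout are
assumptions introduced for illustration.

```python
import math

# Constants of order zero from Example i.
sigma = {1: 1.71, 2: 1.29, 3: 3.09}
r = {(1, 2): -0.66, (1, 3): -0.13, (2, 3): 0.60}


def total(i, j):
    """Total correlation r_ij, the order of the subscripts being indifferent."""
    return r[(min(i, j), max(i, j))]


def partial(i, j, k):
    """First-order correlation r_{ij.k}, by equation (12)."""
    num = total(i, j) - total(i, k) * total(j, k)
    return num / math.sqrt((1 - total(i, k) ** 2) * (1 - total(j, k) ** 2))


def sigma_second_order(i, j, k):
    """sigma_{i.jk} = sigma_i * sqrt(1 - r_ij^2) * sqrt(1 - r_{ik.j}^2)."""
    return (sigma[i] * math.sqrt(1 - total(i, j) ** 2)
            * math.sqrt(1 - partial(i, k, j) ** 2))


def regression(i, j, k):
    """b_{ij.k} = r_{ij.k} * sigma_{i.jk} / sigma_{j.ik}."""
    return partial(i, j, k) * sigma_second_order(i, j, k) / sigma_second_order(j, i, k)


for i, j, k in [(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]:
    print(f"b{i}{j}.{k} = {regression(i, j, k):+.2f}")
```

Calling `sigma_second_order(1, 2, 3)` and `sigma_second_order(1, 3, 2)`
evaluates the same standard-deviation by the two routes mentioned in the
text, and so provides the independent check on the arithmetic.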

The first and second regression-equations are those of most
practical importance. The argument has been advanced that
the giving of out-relief tends to lower earnings, and the total
coefficient ($r_{13} = -0.13$) between earnings ($X_1$) and out-relief
($X_3$), though very small (cf. Chap. IX. § 17), does not seem
inconsistent with such a hypothesis. The partial correlation
coefficient ($r_{13.2} = +0.44$) and the regression-equation (1), however,
indicate that in unions with a given percentage of the
population in receipt of relief ($X_2$) the earnings are highest where
the proportion of out-relief is highest; and this is, in so far,
against the hypothesis of a tendency to lower wages. It remains
possible, of course, that out-relief may adversely affect the
possibility of earning, e.g. by limiting the employment of the old. As
regards pauperism, the argument might be advanced that the
observed correlation ($r_{23} = +0.60$) between pauperism and
out-relief was in part due to the negative correlation ($r_{13} = -0.13$)
between earnings and out-relief. Such a hypothesis would have
little to support it in view of the smallness and doubtful
significance of $r_{13}$, and is definitely contradicted by the positive partial
correlation $r_{23.1} = +0.69$, and the second regression-equation. The
third regression-equation shows that the proportion of out-relief is
on the whole highest where earnings are highest and pauperism
greatest. It should be noticed, however, that a negative ratio is
clearly impossible, and consequently the relation cannot be strictly
linear; but the third equation gives possible (positive) average
ratios for all the combinations of pauperism and earnings that
actually occur.

Example ii. (Four variables.) As an illustration of the form
of the work in the case of four variables, we will take a portion
of the data from another investigation into the causation of
pauperism, viz. that described in the first illustration of Chapter X.,
to which the student should refer for details. The variables are
the ratios of the values in 1891 to the values in 1881 (taken as
100) of:

1. The percentage of the population in receipt of relief,

2. The ratio of the numbers given outdoor relief to the numbers
relieved in the workhouse,

3. The percentage of the population over 65 years of age,

4. The population itself,

in the metropolitan group of 32 unions, and the fundamental
constants (means, standard-deviations and correlations) are as
follows:

Table I.

1.                  2.                       3.                         4.
Means.              Standard-deviations.     Correlation-coefficient.   log sqrt(1 - r^2).
1    104.7          1    29.2                12    +0.52                1̄.93154
2     90.6          2    41.7                13    +0.41                1̄.96003
3    107.7          3     5.5                14    -0.14                1̄.99570
4    111.3          4    23.8                23    +0.49                1̄.94038
                                             24    +0.23                1̄.98820
                                             34    +0.25                1̄.98598


It is seen that the average changes are not great: the percentages
of the population in receipt of relief have increased on an average
by 4.7 per cent., the out-relief ratio has dropped by 9.4 per cent.,
and the percentage of old has increased by 7.7 per cent., at the same
time as the population of the unions has risen on the average by
11.3 per cent. At the same time the standard-deviations of the first,
second, and fourth variables are very large. As a matter of fact,
while in one union the pauperism decreased by nearly 50 per cent.,
and in others by 20 per cent., in some there were increases of 60,
80, and 90 per cent.; similarly, in the case of the out-relief, in
several unions the ratio was decreased by 40 to 60 per cent., a
consistent anti-out-relief policy having been enforced; in others the
ratio was doubled, and more than doubled. As regards population, the
more central districts show decreases ranging up to 20 and 25 per
cent., the circumferential districts increases of 45 to 80 per cent.
The correlations of order zero are not large, the changes in the rate
of pauperism exhibiting the highest correlation with changes in the
out-relief ratio, slightly less with changes in the proportion of
old, and very little with changes in population.

The correlations of the second order are obtained in two steps.
In the first place, the six coefficients of order zero are grouped in
four sets of three, corresponding to the four sets of three variables
formed by omitting each one of the four variables in turn (Table
II. col. 1). Each of these sets of three coefficients is then
treated in the same manner as in the last example, and so the






                                 Table II.

       1.               2.            3.               4.                  5.
  Correlation-       Product       Numerator.     Correlation-      log sqrt(1 - r^2).
  coefficient        Term of                      coefficient
  (Zero Order).      Numerator.                   (First Order).

  12   +0.52         +0.2009       +0.3191        12.3   +0.4013     $\bar{1}$.96187
  13   +0.41         +0.2548       +0.1552        13.2   +0.2084     $\bar{1}$.99035
  23   +0.49         +0.2132       +0.2768        23.1   +0.3553     $\bar{1}$.97070

  12   +0.52         -0.0322       +0.5522        12.4   +0.5731     $\bar{1}$.91355
  14   -0.14         +0.1196       -0.2596        14.2   -0.3123     $\bar{1}$.97772
  24   +0.23         -0.0728       +0.3028        24.1   +0.3580     $\bar{1}$.97022

  13   +0.41         -0.0350       +0.4450        13.4   +0.4642     $\bar{1}$.94731
  14   -0.14         +0.1025       -0.2425        14.3   -0.2746     $\bar{1}$.98297
  34   +0.25         -0.0574       +0.3074        34.1   +0.3404     $\bar{1}$.97326

  23   +0.49         +0.0575       +0.4325        23.4   +0.4590     $\bar{1}$.94863
  24   +0.23         +0.1225       +0.1075        24.3   +0.1274     $\bar{1}$.99645
  34   +0.25         +0.1127       +0.1373        34.2   +0.1618     $\bar{1}$.99424


                                 Table III.

        1.               2.            3.               4.                 5.
  Correlation-        Product       Numerator.     Correlation-     log sqrt(1 - r^2).
  coefficient         Term of                      coefficient
  (First Order).      Numerator.                   (Second Order).

  12.4   +0.5731      +0.2131       +0.3600        12.34   +0.457    $\bar{1}$.94901
  13.4   +0.4642      +0.2631       +0.2011        13.24   +0.276    $\bar{1}$.98277
  23.4   +0.4590      +0.2660       +0.1930        23.14   +0.266    $\bar{1}$.98408

  12.3   +0.4013      -0.0350       +0.4363        12.34   +0.457        ..
  14.3   -0.2746      +0.0511       -0.3257        14.23   -0.359    $\bar{1}$.97013
  24.3   +0.1274      -0.1102       +0.2376        24.13   +0.270    $\bar{1}$.98359

  13.2   +0.2084      -0.0505       +0.2589        13.24   +0.276        ..
  14.2   -0.3123      +0.0337       -0.3460        14.23   -0.359        ..
  34.2   +0.1618      -0.0651       +0.2269        34.12   +0.244    $\bar{1}$.98664

  23.1   +0.3553      +0.1219       +0.2334        23.14   +0.266        ..
  24.1   +0.3580      +0.1209       +0.2371        24.13   +0.270        ..
  34.1   +0.3404      +0.1272       +0.2132        34.12   +0.244        ..





coefficients of the first order (Table II. col. 4) are obtained.
The first-order coefficients are then regrouped in sets of three,
by the secondary suffix (Table III. col. 1), and these are treated
precisely in the same way as the coefficients of order zero.
In this way, it will be seen, the value of each coefficient
of the second order is arrived at in two ways independently, and
so the arithmetic is checked: $r_{12.34}$ occurs in the first and
fourth lines, for instance, $r_{13.24}$ in the second and seventh,
and so on. Of course slight differences may occur in the last digit
if a sufficient number of digits is not retained, and for this reason
the intermediate work should be carried to a greater degree of
accuracy than is necessary in the final result; thus four places
of decimals were retained throughout in the intermediate work of
this example, and three in the final result. If he carries out an
independent calculation, the student may differ slightly from
the logarithms given in this and the following work, if more or
fewer figures are retained.
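
The two-step scheme of Tables II. and III. can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original work: the names `R0` and `partial_r` are hypothetical, and the function simply applies the first-order formula repeatedly, removing one secondary suffix at a time. Eliminating the suffixes in either order should give the same second-order coefficient, which is the arithmetical check described above.

```python
from math import sqrt

# Zero-order correlations of Example ii (Table I, col. 3).
R0 = {(1, 2): 0.52, (1, 3): 0.41, (1, 4): -0.14,
      (2, 3): 0.49, (2, 4): 0.23, (3, 4): 0.25}


def partial_r(i, j, given=()):
    """Correlation of variables i and j, the variables in `given` held constant.

    The last secondary suffix k is removed with the usual formula
    r_ij.k = (r_ij - r_ik r_jk) / sqrt((1 - r_ik^2)(1 - r_jk^2)),
    every coefficient on the right carrying the remaining suffixes.
    """
    if not given:
        return R0[(min(i, j), max(i, j))]
    k, rest = given[-1], tuple(given[:-1])
    r_ij = partial_r(i, j, rest)
    r_ik = partial_r(i, k, rest)
    r_jk = partial_r(j, k, rest)
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


# r_12.34 reached by two routes (through r_12.3 and through r_12.4).
r12_34_a = partial_r(1, 2, (3, 4))
r12_34_b = partial_r(1, 2, (4, 3))
r13_24 = partial_r(1, 3, (2, 4))
r14_23 = partial_r(1, 4, (2, 3))
print(round(r12_34_a, 3), round(r12_34_b, 3), round(r13_24, 3), round(r14_23, 3))
```

Because the sketch carries full floating-point precision, its last digit may differ slightly from the four-place working of the tables.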

Having obtained the correlations, the regressions can be calculated
from the third-order standard-deviations by equations of the
form (as in the last example),

$$b_{12.34} = r_{12.34}\,\frac{\sigma_{1.234}}{\sigma_{2.134}},$$

so the standard-deviations of lower orders need not be evaluated.
Using equations of the form

$$\sigma_{1.234} = \sigma_1\,(1 - r_{12}^2)^{1/2}\,(1 - r_{13.2}^2)^{1/2}\,(1 - r_{14.23}^2)^{1/2},$$

we find

    log sigma_{1.234} = 1.35740        sigma_{1.234} = 22.8
    log sigma_{2.134} = 1.50597        sigma_{2.134} = 32.1
    log sigma_{3.124} = 0.65773        sigma_{3.124} = 4.55
    log sigma_{4.123} = 1.32914        sigma_{4.123} = 21.3


All the twelve regressions of the second order can be readily
calculated, given these standard-deviations and the correlations,
but we may confine ourselves to the equation giving the changes
in pauperism ($X_1$) in terms of other variables as the most
important. It will be found to be

$$x_1 = 0.325\,x_2 + 1.383\,x_3 - 0.383\,x_4,$$

or, transferring the origins and expressing the equation in terms of
percentage-ratios,

$$X_1 = -31.1 + 0.325\,X_2 + 1.383\,X_3 - 0.383\,X_4,$$





or, again, in terms of percentage-changes (ratio - 100):

    Percentage change in pauperism
        = + 1.4 per cent.
          + 0.325 times the change in out-relief ratio
          + 1.383 times the change in proportion of old
          - 0.383 times the change in population.
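
As a rough check on the arithmetic, the regression-equation may be rebuilt from the constants of Table I. alone. The Python sketch below is illustrative only; the helper names (`partial_r`, `sd_residual`, `regression`) are hypothetical and not taken from the text. It forms each third-order standard-deviation as $\sigma_i$ multiplied by successive factors $(1 - r^2)^{1/2}$, and each regression as $b_{12.34} = r_{12.34}\,\sigma_{1.234}/\sigma_{2.134}$.

```python
from math import sqrt

# Constants of Example ii (Table I).
means = {1: 104.7, 2: 90.6, 3: 107.7, 4: 111.3}
sds = {1: 29.2, 2: 41.7, 3: 5.5, 4: 23.8}
R0 = {(1, 2): 0.52, (1, 3): 0.41, (1, 4): -0.14,
      (2, 3): 0.49, (2, 4): 0.23, (3, 4): 0.25}


def partial_r(i, j, given=()):
    """Partial correlation of i and j, removing the suffixes in `given` one by one."""
    if not given:
        return R0[(min(i, j), max(i, j))]
    k, rest = given[-1], tuple(given[:-1])
    r_ij, r_ik, r_jk = (partial_r(i, j, rest), partial_r(i, k, rest),
                        partial_r(j, k, rest))
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def sd_residual(i, others):
    """sigma_{i.others}: sigma_i times one factor sqrt(1 - r^2) per suffix."""
    s = sds[i]
    for pos, k in enumerate(others):
        s *= sqrt(1 - partial_r(i, k, tuple(others[:pos])) ** 2)
    return s


def regression(i, j, others):
    """b_{ij.others} = r_{ij.others} * sigma_{i.j,others} / sigma_{j.i,others}."""
    return (partial_r(i, j, others)
            * sd_residual(i, (j,) + others) / sd_residual(j, (i,) + others))


b12 = regression(1, 2, (3, 4))
b13 = regression(1, 3, (2, 4))
b14 = regression(1, 4, (2, 3))
# Transfer the origin from the means to zero to get the constant term.
const = means[1] - b12 * means[2] - b13 * means[3] - b14 * means[4]
print(round(b12, 3), round(b13, 3), round(b14, 3), round(const, 1))
```

The coefficients should agree with those quoted above to about the third decimal place; small discrepancies arise only from the number of figures retained.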

These results render the interpretation of the total coefficients,
which might be equally consistent with several hypotheses, more
clear and definite. The questions would arise, for instance,
whether the correlation of changes in pauperism with changes in
out-relief might not be due to correlation of the latter with the
other factors introduced, and whether the negative correlation with
changes in population might not be due solely to the correlation
of the latter with changes in the proportion of old. As a matter
of fact, the partial correlations of changes in pauperism with
changes in out-relief and in proportion of old are slightly less than
the total correlations, but the partial correlation with changes in
population is numerically greater, the figures being

    r_{12} = +0.52        r_{12.34} = +0.46
    r_{13} = +0.41        r_{13.24} = +0.28
    r_{14} = -0.14        r_{14.23} = -0.36

So far, then, as we have taken the factors of the case into
account, there appears to be a true correlation between changes
in pauperism and changes in out-relief, proportion of old, and
population, the latter serving, of course, as some index to
changes in general prosperity. The relative influences of the
three factors are indicated by the regression-equation above.
[For the full discussion of the case cf. Jour. Roy. Stat. Soc.,
vol. lxii., 1899.]

15. The correlation between pauperism and labourers' earnings
exhibited by the figures of Example i. was illustrated by a diagram
(fig. 40, p. 180), in which scales of "pauperism" and "earnings"
were taken along two axes at right angles, and every observed
pair of values was entered by marking the corresponding point
with a small circle: the diagram was completed by drawing in
the lines of regression. In precisely the same way the correlation
between three variables may be represented by a model showing the
distribution of points in space; for any set of observed values $X_1$,
$X_2$, $X_3$ may be regarded as determining a point in space, just as
any pair of values $X_1$ and $X_2$ may be regarded as determining a
point in a plane. Fig. 45 is drawn from such a model, constructed
from the data of Example i. Four pieces of wood are fixed together





like the bottom and three sides of a box. Supposing the open
side to face the observer, a scale of pauperism is drawn vertically
upwards along the left-hand angle at the back of the "box," the
scale starting from zero, as very small values of pauperism occur:
a scale of out-relief ratio is taken along the angle between the
back and bottom of the box, starting from zero at the left: finally,
the scale of earnings is drawn out towards the observer along the
angle between the left-hand side and the bottom, but as earnings
lower than 12s. do not occur, the scale may start from 12s. at the
corner. Suitable scales are: pauperism, 1 in. = 1 per cent.;
out-relief ratio, 1 in. = 1 unit; earnings, 1 in. = 1s.; and the inside
measures of the model may then be 17 in. x 10 in. x 8 in. high,
the dimensions of the model constructed. Given these three
scales, any set of observed values determines a point within the
"box." The earnings and out-relief ratio for some one union are
noted first, and the corresponding point marked on the baseboard;
a steel wire is then inserted vertically in the base at this point
and cut off at the height corresponding, on the scale chosen, to
the pauperism in the same union, being finally capped with a
small ball or knob to mark the "point" clearly. The model
shows very well the general tendency of the pauperism to be the
higher the lower the wages and the higher the out-relief, for the
highest points lie towards the back and right-hand side of the
model. If some representation of all three equations of regression
were to be inserted in the model, the result would be rather
confusing; so the most important equation, viz. the second, giving
the average rate of pauperism in terms of the other variables, may
be chosen. This equation represents a plane: the lines in which
it cuts the right- and left-hand sides of the "box" should be
marked, holes drilled at equal intervals on these lines on the
opposite sides of the box (the holes facing each other), and threads
stretched through these holes, thus outlining the plane as shown
in the figure. In the actual model the correlation-diagrams (like
fig. 40) corresponding to the three pairs of variables were drawn
on the back, sides, and base: they represent, of course, the
elevations and plan of the points.

Fig. 45. Model illustrating the Correlation between three Variables:
(1) Pauperism (percentage of the population in receipt of Poor-law
relief); (2) Out-relief ratio (numbers given relief in their homes to
one in the workhouse); (3) Average Weekly Earnings of agricultural
labourers (data pp. 178 and 189). A, front view; B, view of model
tilted till the plane of regression for pauperism on the two
remaining variables is seen as a straight line.

The student possessing some skill in handicraft would find it
worth while to make such a model for some case of interest to
himself, and to study on it thoroughly the nature of the plane of
regression, and the relations of the partial and total correlations.

16. If we write

$$\sigma^2_{1.23\ldots n} = \sigma_1^2\,\bigl(1 - R^2_{1(23\ldots n)}\bigr) \qquad (13)$$

it may be shown that $R_{1(23\ldots n)}$ is the correlation between
$x_1$ and the expression on the right-hand side of the regression-
equation, say $e_{1.23\ldots n}$, where

$$e_{1.23\ldots n} = b_{12.34\ldots n}\,x_2 + b_{13.24\ldots n}\,x_3 + \ldots + b_{1n.23\ldots(n-1)}\,x_n \qquad (14)$$

For we have

$$\Sigma(x_1\,e_{1.23\ldots n}) = \Sigma\,x_1(x_1 - x_{1.23\ldots n}) = N(\sigma_1^2 - \sigma^2_{1.23\ldots n})$$

and also

$$\Sigma(e^2_{1.23\ldots n}) = \Sigma(x_1 - x_{1.23\ldots n})^2 = N(\sigma_1^2 - \sigma^2_{1.23\ldots n}),$$

whence the correlation between $x_1$ and $e_{1.23\ldots n}$ is

$$\frac{(\sigma_1^2 - \sigma^2_{1.23\ldots n})^{1/2}}{\sigma_1},$$

i.e. the value of $R_{1(23\ldots n)}$ given by (13). The value of R is
accordingly a useful datum as indicating how closely $x_1$ can
be expressed in terms of a linear function of $x_2, x_3, \ldots x_n$, and
the values of the regressions may be regarded as determined
by the condition that R shall be a maximum. Its value is
essentially positive as the product-sum $\Sigma(x_1\,e_{1.23\ldots n})$ is positive.

R may be termed a coefficient of (n - 1)-fold (or double, triple,
etc.) correlation; for n variables there are n such correlations,
but in the limiting case of two variables the two are identical.
The value may be readily calculated, either from $\sigma_{1.23\ldots n}$ and
$\sigma_1$ or directly from the equation

$$1 - R^2_{1(23\ldots n)} = (1 - r^2_{12})(1 - r^2_{13.2})(1 - r^2_{14.23}) \ldots (1 - r^2_{1n.23\ldots(n-1)}) \qquad (15)$$

It is obvious from this equation that since every bracket on
the right is not greater than unity,

$$1 - R^2_{1(23\ldots n)} \le 1 - r^2_{12}.$$

Hence $R_{1(23\ldots n)}$ cannot be numerically less than $r_{12}$. For the
same reason, rewriting (15) in every possible form, $R_{1(23\ldots n)}$
cannot be numerically less than $r_{12}, r_{13}, \ldots r_{1n}$, i.e. any one
of the possible constituent coefficients of order zero. Further,
for similar reasons, $R_{1(23\ldots n)}$ cannot be numerically less than
any possible constituent coefficient of any higher order. That
is to say, $R_{1(23\ldots n)}$ is not numerically less than the greatest
of all the possible constituent coefficients, and is usually, though
not always, markedly greater. Thus in Example i., $R_{2(13)}$
(the coefficient of double correlation between pauperism on
the one hand, out-relief and labourers' earnings on the other)
is 0.839, and the numerically greatest of the possible constituent
coefficients is $r_{21.3} = -0.73$. Again, in Example ii., $R_{1(234)}$ is
0.626, and the numerically greatest of the possible constituent
coefficients is $r_{12.4} = +0.573$.
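
Equation (15) lends itself to direct computation. The sketch below is an illustration only (the function name `multiple_r` is hypothetical); it takes the constituent coefficients $r_{12}$, $r_{13.2}$, $r_{14.23}$ of Example ii. as printed in Tables I. to III. and forms R from the product of the brackets.

```python
from math import sqrt


def multiple_r(constituents):
    """R from equation (15): 1 - R^2 is the product of (1 - r^2) over
    r_12, r_13.2, r_14.23, ... (one coefficient of each successive order)."""
    product = 1.0
    for r in constituents:
        product *= 1 - r ** 2
    return sqrt(1 - product)


# Example ii: r_12 = +0.52, r_13.2 = +0.2084, r_14.23 = -0.359.
constituents = [0.52, 0.2084, -0.359]
R_1_234 = multiple_r(constituents)
print(round(R_1_234, 3))
```

Since every bracket is at most unity, the value returned can never be numerically less than any one of the coefficients supplied, which is the inequality stated above.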

The student should notice that R is necessarily positive.
Further, even if all the variables $X_1, X_2, \ldots X_n$ were strictly
uncorrelated in the original universe as a whole, we should expect
$r_{12}$, $r_{13.2}$, $r_{14.23}$, etc., to exhibit values (whether positive or
negative) differing from zero in a limited sample. Hence, R will not
tend, on an average of such samples, to be zero, but will
fluctuate round some mean value. This mean value will
be the greater the smaller the number of observations in the
sample, and also the greater the number of variables. When
only a small number of observations are available it is,
accordingly, little use to deal with a large number of variables.
As a limiting case, it is evident that if we deal with n variables
and possess only n observations, all the partial correlations
of the highest possible order will be unity.
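
The behaviour of R in small samples can be illustrated by a simulation. The following Python sketch is an assumption-laden illustration rather than anything given in the text: it draws three strictly uncorrelated variables from a normal universe (an arbitrary choice), computes the double correlation $R_{1(23)}$ from equation (15) for each sample, and averages over many samples. The function names and the numbers of observations (3, 5 and 50) are hypothetical.

```python
import random
from math import sqrt


def corr(x, y):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def double_r(x1, x2, x3):
    """R_1(23) from equation (15): 1 - R^2 = (1 - r12^2)(1 - r13.2^2)."""
    r12, r13, r23 = corr(x1, x2), corr(x1, x3), corr(x2, x3)
    r13_2 = (r13 - r12 * r23) / sqrt((1 - r12 ** 2) * (1 - r23 ** 2))
    return sqrt(1 - (1 - r12 ** 2) * (1 - r13_2 ** 2))


def mean_r(n_obs, trials=2000, seed=1):
    """Average R over many samples of n_obs observations on three
    independent (hence uncorrelated) normal variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(3)]
        total += double_r(*xs)
    return total / trials


small_sample = mean_r(5)
large_sample = mean_r(50)
limiting_case = mean_r(3, trials=200)   # as many observations as variables
print(small_sample, large_sample, limiting_case)
```

The mean of R should come out clearly larger for samples of 5 than for samples of 50, and in the limiting case of three observations on three variables it should be unity apart from rounding error.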

17. It is obvious that as equations (11) and (12) enable us to
express regressions and correlations of higher orders in terms of
those of lower orders, we must similarly be able to express the
coefficients of lower in terms of those of higher orders. Such
expressions are sometimes useful for theoretical work. Using the
same method of expansion as in previous cases, we have

$$0 = \Sigma(x_{1.23\ldots n}\cdot x_{2.34\ldots(n-1)})$$
$$\;\; = \Sigma(x_1\cdot x_{2.34\ldots(n-1)}) - b_{12.34\ldots n}\,\Sigma(x_2\cdot x_{2.34\ldots(n-1)}) - b_{1n.23\ldots(n-1)}\,\Sigma(x_n\cdot x_{2.34\ldots(n-1)}).$$

That is,

$$b_{12.34\ldots(n-1)} = b_{12.34\ldots n} + b_{1n.23\ldots(n-1)}\cdot b_{n2.34\ldots(n-1)}.$$

In this equation the coefficient on the left and the last on the
right are of order n - 3, the other two of order n - 2. We therefore
wish to eliminate the last coefficient on the right. Interchanging
the suffixes 1 for n and n for 1, we have

$$b_{n2.34\ldots(n-1)} = b_{n2.13\ldots(n-1)} + b_{n1.23\ldots(n-1)}\cdot b_{12.34\ldots(n-1)}.$$

Substituting this value for $b_{n2.34\ldots(n-1)}$ in the first equation we
have

$$b_{12.34\ldots(n-1)} = \frac{b_{12.34\ldots n} + b_{1n.23\ldots(n-1)}\cdot b_{n2.13\ldots(n-1)}}{1 - b_{1n.23\ldots(n-1)}\cdot b_{n1.23\ldots(n-1)}} \qquad (16)$$

This is the required equation for the regressions; it is the equation

$$b_{12} = \frac{b_{12.n} + b_{1n.2}\,b_{n2.1}}{1 - b_{1n.2}\,b_{n1.2}}$$

with secondary suffixes 34 ... (n - 1) added throughout. The
corresponding equation for the correlations is obtained at once
by writing down equation (16) for $b_{21.34\ldots(n-1)}$ and taking the
square root of the product (cf. § 13); this gives

$$r_{12.34\ldots(n-1)} = \frac{r_{12.34\ldots n} + r_{1n.23\ldots(n-1)}\cdot r_{2n.13\ldots(n-1)}}{(1 - r^2_{1n.23\ldots(n-1)})^{1/2}\,(1 - r^2_{2n.13\ldots(n-1)})^{1/2}} \qquad (17)$$

which is similarly the equation

$$r_{12} = \frac{r_{12.n} + r_{1n.2}\,r_{2n.1}}{(1 - r^2_{1n.2})^{1/2}\,(1 - r^2_{2n.1})^{1/2}}$$

with the secondary suffixes 34 ... (n - 1) added throughout.
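
Equation (17) in its simplest form may be checked numerically by a round trip: pass from three coefficients of order zero to the three of the first order, and back again. The sketch below is illustrative only, with hypothetical function names; it uses the zero-order values $r_{12} = 0.52$, $r_{13} = 0.41$, $r_{23} = 0.49$ of Example ii.

```python
from math import sqrt


def first_order(r_ij, r_ik, r_jk):
    """r_ij.k from the three coefficients of order zero (equation (12))."""
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def zero_from_first(r12_3, r13_2, r23_1):
    """r_12 from the three first-order partials (equation (17), simplest form)."""
    return (r12_3 + r13_2 * r23_1) / sqrt((1 - r13_2 ** 2) * (1 - r23_1 ** 2))


r12, r13, r23 = 0.52, 0.41, 0.49
r12_3 = first_order(r12, r13, r23)
r13_2 = first_order(r13, r12, r23)
r23_1 = first_order(r23, r12, r13)

recovered = zero_from_first(r12_3, r13_2, r23_1)
print(round(r12_3, 4), round(r13_2, 4), round(r23_1, 4), round(recovered, 4))
```

Note the change of sign in the numerator: the product term is subtracted in passing to the higher order and added in returning to the lower.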
18. Equations (12) and (17) imply that certain limiting
inequalities must hold between the correlation-coefficients in
the expression on the right in each case in order that real
values (values between ±1) may be obtained for the correlation-
coefficient on the left. These inequalities correspond precisely
with those "conditions of consistence" between class-frequencies
with which we dealt in Chapter II., but we propose to treat them
only briefly here. Writing (12) in its simplest form for $r_{12.3}$,
we must have $r^2_{12.3} < 1$, or

$$\frac{(r_{12} - r_{13}\,r_{23})^2}{(1 - r^2_{13})(1 - r^2_{23})} < 1,$$

that is,

$$r^2_{12} + r^2_{13} + r^2_{23} - 2\,r_{12}\,r_{13}\,r_{23} < 1 \qquad (18)$$

if the three r's are consistent with each other. If we take $r_{12}$
and $r_{13}$ as known, this gives as limits for $r_{23}$

$$r_{12}\,r_{13} \pm \sqrt{1 - r^2_{12} - r^2_{13} + r^2_{12}\,r^2_{13}}.$$

Similarly writing (17) in its simplest form for $r_{12}$ in terms of
$r_{12.3}$, $r_{13.2}$ and $r_{23.1}$, we must have

$$r^2_{12.3} + r^2_{13.2} + r^2_{23.1} + 2\,r_{12.3}\,r_{13.2}\,r_{23.1} < 1 \qquad (19)$$

and therefore, if $r_{12.3}$ and $r_{13.2}$ are given, $r_{23.1}$ must lie
between the limits

$$-\,r_{12.3}\,r_{13.2} \pm \sqrt{1 - r^2_{12.3} - r^2_{13.2} + r^2_{12.3}\,r^2_{13.2}}.$$

The following table gives the limits of the third coefficient in
a few special cases, for the three coefficients of zero order and
of the first order respectively:

            Value of                          Limits of
   r_12 or r_12.3    r_13 or r_13.2       r_23         r_23.1

         0                 0               ±1            ±1
        ±1                ±1               +1            -1
        ±1                ∓1               -1            +1
     ±sqrt(0.5)        ±sqrt(0.5)         0, +1         0, -1
     ±sqrt(0.5)        ∓sqrt(0.5)         0, -1         0, +1





The student should notice that the set of three coefficients of
order zero and value unity are only consistent if either one only,
or all three, are positive, i.e. +1, +1, +1, or -1, -1, +1, but
not -1, -1, -1. On the other hand, the set of three coefficients
of the first order and value unity are only consistent if one only,
or all three, are negative: the only consistent sets are +1, +1,
-1 and -1, -1, -1. The values of the two given r's need to
be very high if even the sign of the third can be inferred; if the
two are equal, they must be at least equal to sqrt(0.5) or .707 ....
Finally, it may be noted that no two values for the known
coefficients ever permit an inference of the value zero for the
third; the fact that 1 and 2, 1 and 3 are uncorrelated, pair and
pair, permits no inference of any kind as to the correlation
between 2 and 3, which may lie anywhere between +1 and -1.
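
The limits of this section are easily tabulated by machine. The Python sketch below is offered only as an illustration (the function names are hypothetical); it uses the fact that the quantity under the root, $1 - r^2_{12} - r^2_{13} + r^2_{12} r^2_{13}$, is the product $(1 - r^2_{12})(1 - r^2_{13})$.

```python
from math import sqrt


def limits_r23(r12, r13):
    """Limits of r_23 consistent with given r_12 and r_13 (from (18))."""
    half_width = sqrt(max(0.0, (1 - r12 ** 2) * (1 - r13 ** 2)))
    return (r12 * r13 - half_width, r12 * r13 + half_width)


def limits_r23_1(r12_3, r13_2):
    """Limits of r_23.1 consistent with given r_12.3 and r_13.2 (from (19))."""
    half_width = sqrt(max(0.0, (1 - r12_3 ** 2) * (1 - r13_2 ** 2)))
    return (-r12_3 * r13_2 - half_width, -r12_3 * r13_2 + half_width)


s = sqrt(0.5)
uncorrelated = limits_r23(0.0, 0.0)      # no inference possible
equal_zero_order = limits_r23(s, s)      # limits 0 and +1
equal_first_order = limits_r23_1(s, s)   # limits 0 and -1
example_ii = limits_r23(0.52, 0.41)      # limits for r_23 in Example ii
print(uncorrelated, equal_zero_order, equal_first_order, example_ii)
```

For the data of Example ii. the limits so found are wide, and the observed value $r_{23} = +0.49$ falls well within them.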

19. We do not think it necessary to add to this chapter a
detailed discussion of the nature of fallacies on which the theory
of multiple correlation throws much light. The general nature of
such fallacies is the same as for the case of attributes, and was
discussed fully in Chap. IV. §§ 1-8. It suffices to point out the
principal sources of fallacy which are suggested at once by the
form of the partial correlation

$$r_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{\sqrt{(1 - r^2_{13})(1 - r^2_{23})}} \qquad (a)$$

and from the form of the corresponding expression for $r_{12}$ in terms
of the partial coefficients

$$r_{12} = \frac{r_{12.3} + r_{13.2}\,r_{23.1}}{\sqrt{(1 - r^2_{13.2})(1 - r^2_{23.1})}} \qquad (b)$$

From the form of the numerator of (a) it is evident (1) that even
if $r_{12}$ be zero, $r_{12.3}$ will not be zero unless either $r_{13}$ or
$r_{23}$, or both, are zero. If $r_{13}$ and $r_{23}$ are of the same sign
the partial correlation will be negative; if of opposite sign, positive.
Thus the quantity of a crop might appear to be unaffected, say, by
the amount of rainfall during some period preceding harvest:
this might be due merely to a correlation between rain and low
temperature, the partial correlation between crop and rainfall
being positive and important. We may thus easily misinterpret
a coefficient of correlation which is zero. (2) $r_{12.3}$ may be, indeed
often is, of opposite sign to $r_{12}$, and this may lead to still more
serious errors of interpretation.

From the form of the numerator of (b), on the other hand, we
see that, conversely, $r_{12}$ will not be zero even though $r_{12.3}$ is zero,
unless either $r_{13.2}$ or $r_{23.1}$ is zero. This corresponds to the theorem
of Chap. IV. § 6, and indicates a source of fallacies similar to
those there discussed.
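
The crop illustration may be put into figures. The values used below are purely hypothetical, chosen only to show the effect and not drawn from any observations: crop (1) and rainfall (2) are taken as uncorrelated, crop and temperature (3) as positively correlated, and rainfall and temperature as negatively correlated.

```python
from math import sqrt


def partial_r(r12, r13, r23):
    """Expression (a): r_12.3 from the three total correlations."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# Hypothetical values: total correlation of crop with rain is zero.
r12, r13, r23 = 0.0, 0.5, -0.6
r12_3 = partial_r(r12, r13, r23)
print(round(r12_3, 3))
```

With $r_{13}$ and $r_{23}$ of opposite sign the partial correlation of crop with rainfall comes out positive and by no means negligible, although the total correlation is zero.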

20. We have seen (§ 9) that $r_{12.3}$ is the correlation between
$x_{1.3}$ and $x_{2.3}$, and that we might determine the value of this
partial correlation by drawing up the actual correlation table for
the two residuals in question. Suppose, however, that instead of
drawing up a single table we drew up a series of tables for values of
$x_{1.3}$ and $x_{2.3}$ associated with values of $x_3$ lying within
successive class-intervals of its range. In general the value of
$r_{12.3}$ would not be the same (or approximately the same) for all such
tables, but would exhibit some systematic change as the value of $x_3$
increased. Hence $r_{12.3}$ should be regarded, in general, as of the
nature of an average correlation: the cases in which it measures
the correlation between $x_{1.3}$ and $x_{2.3}$ for every value of $x_3$ (cf.
Chap. XVI.) are probably exceptional. The process for determining
partial associations (cf. Chap. IV.) is, it will be remembered,
thorough and complete, as we always obtain the actual tables
exhibiting the association between, say, A and B in the universe
of C's and the universe of γ's: that these two associations may
differ materially is illustrated by Example i. of Chap. IV.
(pp. 45-6). It might sometimes serve as a useful check on
partial-correlation work to reclassify the observations by the
fundamental methods of that chapter. For the general case an
extension of the method of the "correlation-ratio" (Chap. X., § 20)
might be useful, though exceedingly laborious. It is actually
employed in the paper cited in ref. 7 and the theory more fully
developed in ref. 8.

REFERENCES.

The preceding chapter is written from the standpoint of refs. 3 and 4, and with the
notation and method of ref. 5. The theory of correlation for several variables was
developed by Edgeworth and Pearson (refs. 1 and 2) from the standpoint of the "normal"
distribution of frequency (cf. Chap. XVI.).

Theory.

(1) Edgeworth, F. Y., "On Correlated Averages," Phil. Mag., 5th Series, vol. xxxiv.,
        1892, p. 194.
(2) Pearson, Karl, "Regression, Heredity, and Panmixia," Phil. Trans. Roy. Soc.,
        Series A, vol. clxxxvii., 1896, p. 253.
(3) Yule, G. U., "On the Significance of Bravais' Formulae for Regression, etc., in the
        case of Skew Correlation," Proc. Roy. Soc., vol. lx., 1897, p. 477.
(4) Yule, G. U., "On the Theory of Correlation," Jour. Roy. Stat. Soc., vol. lx., 1897.
(5) Yule, G. U., "On the Theory of Correlation for any number of Variables treated
        by a New System of Notation," Proc. Roy. Soc., Series A, vol. lxxix., 1907, p. 182.
(6) Hooker, R. H., and G. U. Yule, "Note on Estimating the Relative Influence of
        Two Variables upon a Third," Jour. Roy. Stat. Soc., vol. lxix., 1906, p. 197.
(7) Brown, J. W., M. Greenwood, and Frances Wood, "A Study of Index-Correlations,"
        Jour. Roy. Stat. Soc., vol. lxxvii., 1914, pp. 317-46. (The partial or
        "solid" correlation-ratio is used.)
(8) Isserlis, L., "On the Partial Correlation-Ratio, Pt. I. Theoretical," Biometrika.

Illustrative Applications of Economic Interest.

(9) Yule, G. U., "An Investigation into the Causes of Changes in Pauperism in England,
        etc.," Jour. Roy. Stat. Soc., vol. lxii., 1899, p. 249.
(10) Hooker, R. H., "The Correlation of the Weather and the Crops," Jour. Roy. Stat.
        Soc., vol. lxx., 1907, p. 1.
(11) Snow, E. C., "The Application of the Method of Multiple Correlation to the
        Estimation of Post-censal Populations," Jour. Roy. Stat. Soc., vol. lxxiv., 1911, p. 575.

EXERCISES.

1. (Ref. 10.) The following means, standard-deviations, and correlations are
found for

    X1 = seed-hay crops in cwts. per acre,
    X2 = spring rainfall in inches,
    X3 = accumulated temperature above 42° F. in spring,

in a certain district of England during 20 years.

    M1 = 28.02        sigma_1 = 4.42        r_12 = +0.80
    M2 = 4.91         sigma_2 = 1.10        r_13 = -0.40
    M3 = 594          sigma_3 = 85          r_23 = -0.56

Find the partial correlations and the regression-equation for hay-crop on spring
rainfall and accumulated temperature.

2. (The following figures must be taken as an illustration only: the data
on which they were based do not refer to uniform times or areas.)

    X1 = deaths of infants under 1 year per 1000 births in same year (infantile
         mortality).
    X2 = proportion per thousand of married women occupied for gain.
    X3 = death-rate of persons over 5 years of age per 10,000.
    X4 = proportion per thousand of population living 2 or more to a room
         (overcrowding).

Taking the figures below for 30 urban areas in England and Wales, find the
partial correlations and the regression-equation for infantile mortality on the
other factors.

    M1 = 164        sigma_1 = 20.0         r_12 = +0.49        r_23 = +0.16
    M2 = 158        sigma_2 = 74.9         r_13 = +0.78        r_24 = -0.37
    M3 = 143        sigma_3 = 22.4         r_14 = +0.20        r_34 = +0.23
    M4 = 206        sigma_4 = 130.0

3. If all the correlations of order zero are equal, say = r, what are the values
of the partial correlations of successive orders?

Under the same condition, what is the limiting value of r if all the equal
correlations are negative and n variables have been observed?

4. What is the correlation between $x_{1.2}$ and $x_{2.1}$?

5. Write down from inspection the values of the partial correlations for the
three variables

    X1, X2, and X3 = a.X1 + b.X2.

Check the answer to Qu. 7, Chap. XI., by working out the partial
correlations.

6. If the relation

    a.x1 + b.x2 + c.x3 = 0

holds for all sets of values of x1, x2, and x3, what must the partial correlations
be?

Check the answer to Qu. 9, Chap. XI., by working out the partial
correlations.


PART III. THEORY OF SAMPLING.


CHAPTER XIII.

SIMPLE SAMPLING OF ATTRIBUTES.

1. The problem of the present Part. 2. The two chief divisions of the theory
of sampling. 3. Limitation of the discussion to the case of simple
sampling. 4. Definition of the chance of success or failure of a given
event. 5. Determination of the mean and standard-deviation of the
number of successes in n events. 6. The same for the proportion of
successes in n events: the standard-deviation of simple sampling as a
measure of unreliability, or its reciprocal as a measure of precision. 7.
Verification of the theoretical results by experiment. 8. More detailed
discussion of the assumptions on which the formula for the standard-
deviation of simple sampling is based. 9-10. Biological cases to
which the theory is directly applicable. 11. Standard-deviation of
simple sampling when the numbers of observations in the samples
vary. 12. Approximate value of the standard-deviation of simple
sampling, and relation between mean and standard-deviation, when
the chance of success or failure is very small. 13. Use of the standard-
deviation of simple sampling, or standard error, for checking and
controlling the interpretation of statistical results.

1. On several occasions in the preceding chapters it has been
pointed out that small differences between statistical measures like
percentages, averages, measures of dispersion and so forth cannot
in general be assumed to indicate the action of definite and
assignable causes. Small differences may easily arise from indefinite
and highly complex causation such as determines the fluctuating
proportions of heads and tails in tossing a coin, of black balls in
drawing samples from a bag containing a mixture of black and
white balls, or of cards bearing measurements within some given
class-interval in drawing cards, say, from an anthropometric record.
In 100 throws of a coin, for example, we may have noted 56 heads
and only 44 tails, but we cannot conclude that the coin is biassed:
on repeating our throws we may get only 48 heads and 52 tails.
Similarly, if on measuring the statures of 1000 men in each of
two nations we find that the mean stature is slightly greater for
nation A than for nation B, we cannot necessarily conclude that
the real mean stature is greater in the case of nation A: possibly
if the observations were repeated on different samples of 1000
men the ratio might be reversed.

2. The theory of such fluctuations may be termed the theory
of sampling, and there are two chief sections of the theory
corresponding to the theory of attributes and the theory of variables
respectively. In tossing a coin we only classify the results of the
tosses as heads or tails; in drawing balls from a mixture of black
and white balls, we only classify the balls drawn as black or as
white. These cases correspond to the theory of attributes, and
the general case may be represented as the drawing of a sample
from a universe containing both A's and α's, the number or
proportion of A's in successive samples being observed. If, on the
other hand, we put in a bag a number of cards bearing different
values of some variable X and draw sample batches of cards, we
can form averages and measures of dispersion for the successive
batches, and these averages and measures of dispersion will vary
slightly from one batch to another. If associated measures of
two variables X and Y are recorded on each card, we can also form
correlation-coefficients for the different batches, and these will vary
in a similar manner. These cases correspond to the theory of
variables, and it is the function of the theory of sampling for such
cases to inform us as to the fluctuations to be expected in the
averages, measures of dispersion, correlation-coefficients, etc., in
successive samples. In the present and the three following
chapters the theory of sampling is dealt with for the case of
attributes alone. The theory is of great importance and interest,
not only from its applications to the checking and control of
statistical results, but also from the theoretical forms of frequency-
distribution to which it leads. Finally, in Chapter XVII. one or
two of the more important cases of the theory of sampling for
variables are briefly treated, the greater part of the theory, owing
to its difficulty, lying somewhat outside the limits of this work.
3. The theory of sampling attains its greatest simplicity if
every observation contributed to the sample may be regarded as
independent of every other. This condition of independence
holds good, e.g., for the tossing of a coin or the throwing of a die:
the result of any one throw or toss does not affect, and is
unaffected by, the results of the preceding and following tosses.
It does not hold good, on the other hand, for the drawing of balls
from a bag: if a ball be drawn from a bag containing 3 black
and 3 white balls, the remainder may be either 2 black and 3
white, or 2 white and 3 black, according as the first ball was
black or white. The result of drawing a second ball is therefore
dependent on the result of drawing the first. The disturbance
can only be eliminated by drawing from a bag containing a
number of balls that is infinitely large compared with the
total number drawn, or by returning each ball to the bag before
drawing the next. In this chapter our attention will be confined
to the case of independent sampling, as in coin-tossing or dice-
throwing, the simplest cases of an artificial kind suitable for
theoretical study and experimental verification. For brevity, we
may refer to such cases of sampling as simple sampling: the
implied conditions are discussed more fully in § 8 below.

4. If we may regard an ideal coin as a uniform, homogeneous
circular disc, there is nothing which can make it tend to fall more
often on the one side than on the other; we may expect, therefore,
that in any long series of throws the coin will fall with
either face uppermost an approximately equal number of times,
or with, say, heads uppermost approximately half the times.
Similarly, if we may regard the ideal die as a perfect homogeneous
cube, it will tend, in any long series of throws, to fall with each
of its six faces uppermost an approximately equal number of
times, or with any given face uppermost one-sixth of the whole
number of times. These results are sometimes expressed by
saying that the chance of throwing heads (or tails) with a coin is
1/2, and the chance of throwing six (or any other face) with a die
is 1/6. To avoid speaking of such particular instances as coins
or dice, we shall in future, using terms which have become
conventional, refer to an event the chance of success of which is p
and the chance of failure q. Obviously p + q = 1.

5. Suppose we take N samples with n events in each. What
will be the values towards which the mean and standard-deviation
of the number of successes in a sample will tend? The mean is
given at once, for there are Nn events, of which approximately
pNn will be successes, and the mean number of successes in a
sample will therefore tend towards pn. As regards the
standard-deviation, consider first the single event (n = 1). The single
event may give either no successes or one success, and will tend
to give the former qN, the latter pN, times in N trials. Take
this frequency-distribution and work out the standard-deviation
of the number of successes for the single event, as in the case of
an arithmetical example:



Frequency f    Successes ξ      fξ      fξ²
    qN              0            -       -
    pN              1           pN      pN
    N               -           pN      pN




We have therefore $M = p$ and $\sigma_1^2 = p - p^2 = pq$.

But the number of successes in a group of n such events is the
sum of successes for the single events of which it is composed,
and, all the events being independent, we have therefore, by the
usual rule for the standard-deviation of the sum of independent
variables (Chap. XI. § 2, equation (2)), $\sigma_n$ being the
standard-deviation of the number of successes in n events,

$$\sigma_n^2 = npq \qquad (1)$$

This is an equation of fundamental importance in the theory of
sampling. The student should particularly bear in mind that
the standard-deviation of the number of successes, due to
fluctuations of simple sampling alone, in a group of n events
varies, not directly as n, but as the square root of n.
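
Equation (1) may be checked numerically. The Python sketch below is an illustration only: it assumes the n events are independent with a common chance p, so that the chance of exactly k successes is the binomial term C(n, k) p^k q^(n-k), and it works out the mean and standard-deviation directly from those chances. The function name `binomial_moments` is a hypothetical one chosen for the example.

```python
from math import comb, sqrt


def binomial_moments(n, p):
    """Exact mean and standard-deviation of the number of successes in
    n independent events, each with chance of success p."""
    q = 1.0 - p
    # Chance of exactly k successes, for k = 0, 1, ..., n.
    chances = [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]
    mean = sum(k * c for k, c in enumerate(chances))
    variance = sum((k - mean) ** 2 * c for k, c in enumerate(chances))
    return mean, sqrt(variance)


if __name__ == "__main__":
    p = 0.5
    for n in (12, 48, 192):
        mean, sd = binomial_moments(n, p)
        # Last two columns: directly computed value and sqrt(npq).
        print(n, round(mean, 4), round(sd, 4), round(sqrt(n * p * (1 - p)), 4))
```

Quadrupling n from 12 to 48 doubles the standard-deviation rather than quadrupling it, in accordance with the square-root rule.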

6. In lieu of recording the absolute number of successes in each
sample of n events, we might have recorded the proportion of
such successes, i.e. 1/n-th of the number in each sample. As this
would amount to merely dividing all the figures of the original
record by n, the mean proportion of successes, or rather the value
towards which the mean tends to approach, must be p, and the
standard-deviation of the proportion of successes $s_n$ is given by

$$s_n^2 = \frac{pq}{n} \qquad (2)$$

The standard-deviation of the proportion of successes in samples
of such independent events varies therefore inversely as the square
root of the number on which the proportion is calculated. Now,
if we regard the observed proportion in any one sample as a
more or less unreliable determination of the true proportion in
a very large sample from the same material, the standard-deviation
of sampling may fairly be taken as a measure of the
unreliability of the determination: the greater the standard-deviation,
the greater the fluctuations of the observed proportion,
although the true proportion is the same throughout. The
reciprocal of the standard-deviation (1/s), on the other hand, may
be regarded as a measure of reliability, or, as it is sometimes
termed, precision, and consequently the reliability or precision of
an observed proportion varies as the square root of the number of
observations on which it is based. This is again a very important
rule with many practical applications, but the limitations of the
case to which it applies, and the exact conditions from which it
has been deduced, should be borne in mind. We return to this
point again below (§ 8 and Chap. XIV.).
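
The rule that precision varies as the square root of the number of observations can be seen in a few lines of Python. This is a minimal sketch under the conditions of simple sampling; the function names are hypothetical and the sample sizes are chosen only for illustration.

```python
from math import sqrt


def sd_of_proportion(p, n):
    """Equation (2): standard-deviation of the proportion of successes
    in samples of n independent events with chance of success p."""
    return sqrt(p * (1.0 - p) / n)


def precision(p, n):
    """Reciprocal of the standard-deviation, taken as a measure of reliability."""
    return 1.0 / sd_of_proportion(p, n)


if __name__ == "__main__":
    p = 0.5
    for n in (100, 400, 1600):
        # Each fourfold increase in n halves s and doubles the precision.
        print(n, sd_of_proportion(p, n), precision(p, n))
```

To double the precision of an observed proportion, therefore, four times as many observations are required.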

7. Experiments in coin tossing, dice throwing, and so forth
have been carried out by various persons in order to obtain
experimental verification of these results. The following will serve
as illustrations, but the student is strongly recommended to
carry out a few series of such experiments personally, in order to
acquire confidence in the use of the theory. It may be as well
to remark that if ordinary commercial dice are to be used for the
trials, care should be taken to see that they are fairly true cubes,
and the marks not cut very deeply. Cheap dice are generally
very much out of truth, and if the marks are deeply cut the
balance of the die may be sensibly affected. A convenient mode
of throwing a number of dice, suggested, we believe, by the late
Professor Weldon, is to roll them down an inclined gutter of
corrugated paper, so that they roll across the corrugations.

(1) (W. F. R. Weldon, cited by Professor F. Y. Edgeworth,
Encycl. Brit., 11th edn., vol. xxii. p. 394. Totals of the columns
in the table there given.)

Twelve dice were thrown 4096 times; a throw of 4, 5, or 6 points
reckoned a success, therefore p = q = 0.5. Theoretical mean M = 6;
theoretical value of the standard-deviation
$\sigma = \sqrt{0.5 \times 0.5 \times 12} = 1.732$.

The following was the frequency-distribution observed:

Successes   Frequency        Successes   Frequency
    0           -                7          847
    1           7                8          536
    2          60                9          257
    3         198               10           71
    4         430               11           11
    5         731               12            -
    6         948              Total       4096

Mean M = 6.139, standard-deviation σ = 1.712. The proportion of
successes is 6.139/12 = 0.512 instead of 0.5.

(2) (W. F. R. Weldon, loc. cit., p. 400. Totals of columns of
the table given.)

Twelve dice were thrown 4096 times; only a throw of 6 was
counted a success, so p = 1/6, q = 5/6. Theoretical mean M = 2,
standard-deviation $\sigma = \sqrt{1/6 \times 5/6 \times 12} = 1.291$.

The following was the observed frequency-distribution:


Successes   Frequency        Successes   Frequency
    0         447                5          115
    1        1145                6           24
    2        1181                7            7
    3         796                8            1
    4         380              Total       4096






Mean M = 2.000, standard-deviation σ = 1.296. Actual proportion
of successes 2.00/12 = 0.1667, agreeing with the theoretical value
to the fourth place of decimals. Of course such very close
agreement is accidental, and not to be always expected.

(3) (G. U. Yule.) The following may be taken as an illustration
based on a smaller number of observations. Three dice were
thrown 648 times, and the numbers of 5's or 6's noted at
each throw. p = 1/3, q = 2/3. Theoretical mean 1.
Standard-deviation, 0.816.

Frequency-distribution observed:

Successes   Frequency
    0         179
    1         298
    2         141
    3          30
  Total       648

M = 1.034, σ = 0.823. Actual proportion of successes 0.345.

For other illustrations, some of which are cited in the questions
at the end of this chapter, the student may be referred to the
list of references on p. 273. The student should notice that in
all the distributions given a range of six times the
standard-deviation includes either all, or the great bulk of, the observations,
as in most frequency-distributions of the same general form. We
shall make use of this rule below, § 13.
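
The arithmetic of illustrations (2) and (3) is easily repeated. The following Python sketch recomputes the mean and standard-deviation from the two observed frequency-distributions, sets them beside the theoretical values np and √(npq), and reports what share of the observations lies within three times the standard-deviation on either side of the mean. The helper names are hypothetical; the frequencies are those quoted above.

```python
from math import sqrt


def summarise(freq):
    """Return (total, mean, standard-deviation) for a table that maps
    number of successes -> observed frequency."""
    total = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / total
    variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / total
    return total, mean, sqrt(variance)


def share_within_three_sd(freq):
    """Fraction of observations lying within mean +/- 3 standard-deviations."""
    total, mean, sd = summarise(freq)
    inside = sum(f for x, f in freq.items() if abs(x - mean) <= 3 * sd)
    return inside / total


# Illustration (2): twelve dice, a six counted as a success.
weldon_sixes = {0: 447, 1: 1145, 2: 1181, 3: 796, 4: 380,
                5: 115, 6: 24, 7: 7, 8: 1}
# Illustration (3): three dice, a 5 or 6 counted as a success.
yule_three_dice = {0: 179, 1: 298, 2: 141, 3: 30}

if __name__ == "__main__":
    for label, freq, n, p in (("(2)", weldon_sixes, 12, 1 / 6),
                              ("(3)", yule_three_dice, 3, 1 / 3)):
        total, mean, sd = summarise(freq)
        print(label, total, round(mean, 3), round(sd, 3),
              "theory:", round(n * p, 3), round(sqrt(n * p * (1 - p)), 3),
              "within 3 sd:", round(share_within_three_sd(freq), 4))
```

The same two functions may be applied to any record of throws the student makes personally.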

8. In deducing the formulae (1) and (2) for the
standard-deviations of simple sampling in the cases with which we have
been dealing, only one condition has been explicitly laid down as
necessary, viz. the independence of the several drawings, tossings,
or other events composing the sample. In point of fact this
is not the only nor the most fundamental condition which has
been explicitly or implicitly assumed, and it is necessary to realise
all the conditions in order to grasp the limitations under which
alone the formulae arrived at will hold. Supposing, for example,
that we observe among groups of 1000 persons, at different times
or in different localities, various percentages of individuals
possessing certain characteristics (dark hair, or blindness, or
insanity, and so forth). Under what conditions should we
expect the observed percentages to obey the law of sampling
that we have found, and show a standard-deviation given by
equation (2)?

(a) In the first place we have tacitly assumed throughout the
preceding work that our dice or our coins were the same set or
identically similar throughout the experiment, so that the chance
of throwing "heads" with the coins or, say, "six" with the dice
was the same throughout: we did not commence an experiment
with dice loaded in one way and later on take a fresh set of dice
loaded in another way. Consequently if formula (2) is to hold
good in our practical case of sampling there must not be a
difference in any essential respect, i.e. in any character that can
affect the proportion observed, between the localities from which
the observations are drawn, nor, if the observations have been
made at different epochs, must any essential change have taken
place during the period over which the observations are spread.
Where the causation of the character observed is more or less
unknown, it may, of course, be difficult or impossible to say what
differences or changes are to be regarded as essential, but, where
we have more knowledge, the condition laid down enables us to
exclude certain cases at once from the possible applications of
formula (1) or (2). Thus it is obvious that the theory of simple
sampling cannot apply to the variations of the death-rate in
localities with populations of different age and sex compositions,
nor to death-rates in a mixture of healthy and unhealthy districts,
nor to death-rates in successive years during a period of
continuously improving sanitation. In all such cases variations
due to definite causes are superposed on the fluctuations of
sampling.

(b) In the second place, we have also tacitly assumed not
only that we were using the same set of coins or dice throughout,
so that the chances p and q were the same at every trial, but
also that all the coins and dice in the set used were identically
similar, so that the chances p and q were the same for every coin
or die. Consequently, if our formulae are to apply in the practical
case of sampling, the conditions that regulate the appearance of
the character observed must not only be the same for every
sample, but also for every individual in every sample. This is
again a very marked limitation. To revert to the case of
death-rates, formulae (1) and (2) would not apply to the numbers of
persons dying in a series of samples of 1000 persons, even if these
samples were all of the same age and sex composition, and living
under the same sanitary conditions, unless, further, each sample
only contained persons of one sex and one age. For if each
sample included persons of both sexes and different ages, the
condition would be broken, the chance of death during a given
period not being the same for the two sexes, nor for the young
and the old. The groups would not be homogeneous in the sense
required by the conditions from which our formulae have been
deduced. Similarly, if we were observing hair-colours, our formulae
would not apply if the samples were compounded by always
taking one person from district A, another from district B, and
so on, these districts not being similar as regards the distribution
of hair-colour.

The above conditions were only tacitly assumed in our previous
work, and consequently it has been necessary to emphasise them
specially. The third condition was explicitly stated. (c) The
individual "events," or appearances of the character observed,
must be completely independent of one another, like the throws
of a die, or sensibly so, like the drawings of balls from a bag
containing a number of balls that is very large compared with
the number drawn. Reverting to the illustration of a death-rate,
our formulae would not apply even if the sample populations
were composed of persons of one age and one sex, if we were
dealing, for example, with deaths from an infectious or contagious
disease. For if one person in a certain sample has contracted
the disease in question, he has increased the possibility of others
doing so, and hence of dying from the disease. The same thing
holds good for certain classes of deaths from accident, e.g. railway
accidents due to derailment, and explosions in mines: if such an
accident is fatal to one person it is probably fatal to others also,
and consequently the annual returns show large and more or
less erratic variations.

When we speak of simple sampling in the following pages, the
term is intended to imply the fulfilment of all the conditions (a),
(b), and (c), all the samples and all the individual contributions to
each sample being taken under precisely the same conditions,
and the individual "events" or appearances of the character being
quite independent. It may be as well expressly to note that we
need not make any assumption as to the conditions that determine
p unless we have to estimate $\sqrt{npq}$ a priori. If we draw a
sample and observe in it the actual proportion of, say, A's:
draw another sample under precisely the same conditions, and
observe the proportion of A's in the two samples together: add
to these a third sample, and so on, we will find that p approaches,
not continuously, but with some fluctuations, closer and closer
to some limiting value. It is this limiting value which is to be
used in our formulae, the value of p that would be observed in
a very large sample. The standard-deviation of the number of
sixes thrown with n dice, on this understanding, may be $\sqrt{npq}$
even if the dice be out of truth or loaded so that p is no longer
1/6. Similarly, the standard-deviation of the number of black
balls in samples of n drawn from an infinitely large mixture of
black and white balls in equal proportions may be $\sqrt{npq}$ even
if p is, say, 1/3, and not 1/2 owing to the black balls, for some
reason, tending to slip through our fingers. (Cf. Chap. XIV.)

9. It is evident that these conditions very much limit the
field of practical cases of an economic or sociological character
to which formulae (1) and (2) can apply without considerable
modification. The formulae appear, however, to hold to a high
degree of approximation in certain biological cases, notably in
the proportions of offspring of different types obtained on crossing
hybrids, and, with some limitations, to the proportions of the
two sexes at birth. It is possible, accordingly, that in these cases
all the necessary conditions are fulfilled, but this is not a necessary
inference from the mere applicability of the formulae (cf. Chap.
XIV. § 15). In the case of the sex-ratio at birth, it seems
doubtful whether the rule applies to the frequency of the sexes in
individual families of given numbers (ref. 9), but it does apply
fairly closely to the sex-ratios of births in different localities,
and still more closely to the ratios in one locality during
successive periods. That is to say, if we note the number of
males in a series of groups of n births each, the standard-deviation
of that number is approximately $\sqrt{npq}$, where p is the chance
of a male birth; or, otherwise, $\sqrt{pq/n}$ is the standard-deviation
of the proportion of male births. We are not able to assign an
a priori value to the chance p as in the case of dice-throwing,
but it is quite sufficiently accurate for practical purposes to use
the proportion of male births actually observed if that proportion
be based on a moderately large number of observations.

10. In Table VI. of Chap. IX. (p. 163) was given a
correlation-table between the total numbers of births in the registration districts
of England and Wales during the decade 1881-90 and the
proportion of male births. The table below gives some similar figures,
based on the same data, for a few isolated groups of districts
containing not less than 30 to 40 districts each. In both tables the
drop in dispersion as we pass from the small to the large districts
is extremely striking. The actual standard-deviations, and the
standard-deviations of simple sampling corresponding to the
mid-numbers of births, are given at the foot of the table, and it will
be seen that the two agree, on the whole, with surprising closeness,
considering the small numbers of observations. The actual
standard-deviation is, however, the larger of the two in every case
but one. The corresponding standard-deviations for Table VI. of
Chap. IX. are given in Qu. 7 at the end of this chapter, and show
the same general agreement with the standard-deviations of simple
sampling; the actual standard-deviations are, however, again, as
a rule, slightly in excess of the theoretical values.





Table showing Frequencies of Registration Districts in England and Wales with
Different Ratios of Male to Total Births during the Decade 1881-90, for
Groups of Districts with the Numbers of Births in the Decade lying between
Certain Limits. [Data based on Decennial Supplement to Fifty-fifth Annual
Report of the Registrar-General for England and Wales.]

Male births                        Number of births in decade
per thousand         1500     3500     4500   10,000   15,000   30,000   50,000
total births           to       to       to       to       to       to       to
                     2500     4000     5000   15,000   20,000   50,000   90,000

466-7                   1        -        -        -        -        -        -
482-3                   1        -        -        -        -        -        -
492-3                   1        -        1        -        -        -        -
494-5                   1        -        1        -        -        -        -
496-7                   2        3        -        -        -        -        -
498-9                   -        1        -        -        -        1        -
500-1                   2        4        2        1        -        -        -
502-3                   3        3        3        3        -        -        -
504-5                   3        1        3       10        4        4        6
506-7                   5        5        3        6        6        6       10
508-9                   -        3        3        9        4       16       12
510-1                   4        3        9       15        5        8        5
512-3                   1        5        2        8        9        4        2
514-5                   2        2        3       10        2        3        -
516-7                   -        3        3        5        2        1        -
518-9                   4        -        3        4        -        -        -
520-1                   1        -        1        -        1        -        -
522-3                   2        1        3        1        -        -        -
524-5                   1        2        -        -        -        -        -
526-7                   1        1        -        1        -        -        -
528-9                   -        -        -        -        -        -        -
530-1                   -        1        -        -        -        -        -
532-3                   -        -        -        -        -        -        -
534-5                   -        -        -        -        -        -        -
536-7                   1        -        -        -        -        -        -

Total                  36       38       40       73       33       43       35
Mean                508.2    509.5    510.2    510.6    510.3    509.0    507.8
St. deviation s      12.8     8.53     7.12     4.98     3.87     3.22     2.20
Theo. st. dev.
corresponding to     11.2     8.16     7.25     4.47     3.78     2.50     1.89
mean births s_0
sqrt(s^2-s_0^2)*      6.2      2.5        -      2.2      0.8      2.0      1.1

* The meaning of this expression is explained in § 10 of Chap. XIV.
















The student should note that in both cases the standard-deviations
given are standard-deviations of the proportion of male
births per 1000 of all births, that is, 1000 times the values given
by equation (2). These values are given by simply substituting
the proportions per 1000 for p and q in the formula. Thus for
the first column of Table I. the proportion of males is 508 per
1000 births, the mid-number of births 2000, and therefore

$$s_0 = \sqrt{\frac{508 \times 492}{2000}} = 11.2.$$



11. In the above illustration the difficulty due to the wide
variation in the number of births n in different districts has been
surmounted by grouping these districts in limited class intervals,
and assuming that it would be sufficiently accurate for practical
purposes to treat all the districts in one class as if the sex-ratios
had been based on the mid-numbers of births. Given a sufficiently
large number of observations, such a process does well enough,
though it is not very good. But if the number of observations
does not exceed, perhaps, 50 or 60 altogether, grouping is
obviously out of the question, and some other procedure must be
adopted.
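
The theoretical standard-deviations at the foot of the table in § 10 may be recomputed on the same plan of treating every district in a class as if it had the mid-number of births. In the Python sketch below the mid-numbers are taken as the midpoints of the class limits, which is an assumption for every group but the first (for which the text quotes 2000); the means and observed standard-deviations are those read from the foot of the table, and the function name is hypothetical.

```python
from math import sqrt


def sd_per_thousand(males_per_thousand, births):
    """1000 times equation (2): standard-deviation of simple sampling of
    the number of male births per 1000 births, in groups of `births`."""
    p = males_per_thousand / 1000.0
    return 1000.0 * sqrt(p * (1.0 - p) / births)


# (assumed mid-number of births, mean male births per 1000, observed s)
groups = [
    (2000, 508.2, 12.8),
    (3750, 509.5, 8.53),
    (4750, 510.2, 7.12),
    (12500, 510.6, 4.98),
    (17500, 510.3, 3.87),
    (40000, 509.0, 3.22),
    (70000, 507.8, 2.20),
]

if __name__ == "__main__":
    for births, mean, observed in groups:
        theoretical = sd_per_thousand(mean, births)
        print(births, observed, round(theoretical, 2))
```

Only the group with mid-number 4750 shows an observed standard-deviation below the theoretical one.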

Suppose, then, that a series of samples have been taken from
the same material, $f_1$ samples containing $n_1$ individuals or
observations each, $f_2$ containing $n_2$, $f_3$ containing $n_3$, and so on. What
would be the standard-deviation of the observed proportions in
these samples? Evidently the square of the standard-deviation
in the first group would be $pq/n_1$, in the second $pq/n_2$, and so on;
therefore, as the means tend to the same values in all the groups,
we must have for the whole series (N being the total number of
samples)

$$N s^2 = pq \left( \frac{f_1}{n_1} + \frac{f_2}{n_2} + \frac{f_3}{n_3} + \dots \right)$$

But if H be the harmonic mean of $n_1$, $n_2$, $n_3$, ...,

$$\frac{1}{H} = \frac{1}{N} \left( \frac{f_1}{n_1} + \frac{f_2}{n_2} + \frac{f_3}{n_3} + \dots \right)$$

and accordingly

$$s^2 = \frac{pq}{H} \qquad (3)$$

That is to say, where the number of observations varies from one
sample to another, the harmonic mean number of observations in
a sample must be substituted for n in equation (2).
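
A short numerical sketch may make the substitution of the harmonic mean concrete. The sample sizes below are hypothetical and chosen only for illustration (they are not data from the text), as are the function names; the code pools the squared standard-deviations of the separate groups directly and confirms that the result is the same as that given by equation (3).

```python
from math import sqrt


def harmonic_mean(groups):
    """groups: list of (n, f) pairs, meaning f samples of n observations each."""
    total = sum(f for _, f in groups)
    return total / sum(f / n for n, f in groups)


def sd_pooled_directly(p, groups):
    """N s^2 = pq (f1/n1 + f2/n2 + ...), solved for s."""
    total = sum(f for _, f in groups)
    return sqrt(p * (1.0 - p) * sum(f / n for n, f in groups) / total)


def sd_from_harmonic_mean(p, groups):
    """Equation (3): s^2 = pq / H."""
    return sqrt(p * (1.0 - p) / harmonic_mean(groups))


# Hypothetical example: 10 samples of 2, 20 samples of 4, 10 samples of 8.
example_groups = [(2, 10), (4, 20), (8, 10)]

if __name__ == "__main__":
    p = 0.25
    print(harmonic_mean(example_groups))
    print(sd_pooled_directly(p, example_groups))
    print(sd_from_harmonic_mean(p, example_groups))
```

The harmonic mean is less than the arithmetic mean size of sample, so that using the arithmetic mean in equation (2) would understate the standard-deviation.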

Thus the following percentages (taken to the nearest unit) of
albinos were obtained in 121 litters from hybrids of Japanese
waltzing mice by albinos, crossed inter se (A. D. Darbishire,
Biometrika, iii. p. 30):


Percentage.   Frequency.        Percentage.   Frequency.
     0            40                 40            3
    14             4                 43            2
    17             9                 50           16
    20             9                 57            1
    22             1                 60            3
    25            10                 67            4
    29             3                 80            1
    33            13                100            2


The distribution is very irregular owing to the small numbers in
the litters, and the standard-deviation is 23.09 per cent. The
numbers of litters of different sizes were given in § 27 of Chap.
VII. (p. 128), and the harmonic mean size of litter was found to be
3.525. The expected proportion of albinos is 25 per cent., and
hence the standard-deviation of sampling is

$$s = \sqrt{\frac{25 \times 75}{3.525}} = 23.06 \text{ per cent.},$$

in very close agreement with the actual value. The proportion
of albinos amongst all the offspring together was 24.7 per cent.

12. If one of the two proportions p and q become very small,
equation (1) may be put into an approximate form that is very
useful. Suppose p to be the proportion that becomes very small,
so that we may neglect $p^2$ compared with p; then

$$\sigma_n^2 = npq = n(p - p^2) = np \text{ approximately,}$$

and consequently we have approximately

$$\sigma_n = \sqrt{np} = \sqrt{M} \qquad (4)$$

That is to say, if the proportion of successes be small, the
standard-deviation of the number of successes is the square root of
the mean number of successes. Hence we can find the
standard-deviation of sampling even though p be unknown, provided only
we know that it is small.

Thus (ref. 15) in 10 Prussian army corps in 20 years (1875-1894)
there were 122 men killed by the kick of a horse, or, on an
average, there were 0.61 deaths from that cause in each army
corps annually. From equation (4) we accordingly have for the
standard-deviation of simple sampling

$$\sigma = \sqrt{0.61} = 0.78$$





The frequency-distribution of the number of deaths per army
corps per annum was

Deaths   Frequency
   0        109
   1         65
   2         22
   3          3
   4          1

whence

$$\sigma^2 = 0.6079, \qquad \sigma = 0.78,$$

an almost exact agreement with the standard-deviation of simple
sampling.
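
The figures of this illustration may be verified as follows. The Python sketch below computes the mean and the square of the standard-deviation from the observed distribution of deaths and compares the observed standard-deviation with √M of equation (4); the variable names are hypothetical.

```python
from math import sqrt

# Deaths per army corps per annum -> number of corps-years observed.
deaths = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}

total = sum(deaths.values())                         # corps-years
mean = sum(x * f for x, f in deaths.items()) / total
variance = sum(f * (x - mean) ** 2 for x, f in deaths.items()) / total

if __name__ == "__main__":
    print("corps-years:", total)
    print("mean deaths M:", round(mean, 2))
    print("observed sigma^2:", round(variance, 4))
    print("observed sigma:", round(sqrt(variance), 2))
    print("sqrt(M), equation (4):", round(sqrt(mean), 2))
```

The observed square of the standard-deviation is very nearly equal to the mean itself, which is the relation equation (4) requires when p is small.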

13. We may now turn from these verifications of the theoretical
results for various special cases, to the use of the formulae for
checking and controlling the interpretation of statistical results.
If we observe, in a statistical sample, a certain proportion of
objects or individuals possessing some given character, say A,
this proportion differing more or less from the proportion which
for some reason we expected, the question always arises whether
the difference may be due to the fluctuations of simple sampling
only, or may be indicative of definite differences between the
conditions in the universe from which the sample has been drawn
and the assumed conditions on which we based our expectation.
Similarly, if we observe a different proportion in one sample from
that which we have observed in another, the question again arises
whether this difference may be due to fluctuations of simple
sampling alone, or whether it indicates a difference between the
conditions subsisting in the universes from which the two samples
were drawn: in the latter case the difference is often said to be
significant. These questions can be answered, though only more
or less roughly at present, by comparing the observed difference
with the standard-deviation of simple sampling. We know
roughly that the great bulk at least of the fluctuations of sampling
lie within a range of ± three times the standard-deviation;
and if an observed difference from a theoretical result greatly
exceeds these limits it cannot be ascribed to a fluctuation of
"simple sampling" as defined in § 8: it may therefore be
significant. The "standard-deviation of simple sampling" being the
basis of all such work, it is convenient to refer to it by a shorter
name. The observed proportions of A's in given samples being
regarded as differing by larger or smaller errors from the true
proportion in a very large sample from the same material, the
"standard-deviation of simple sampling" may be regarded as a
measure of the magnitude of such errors, and may be called
accordingly the standard error.

Three principal cases of comparison may be distinguished.

Case I. It is desired to know whether the deviation of a certain
observed number or proportion from an expected theoretical value
is possibly due to errors of sampling.

In this case the observed difference is to be compared with the
standard error of the theoretical number or proportion, for the
number of observations contained in the sample.

Example i. In the first illustration of § 7, 25,145 throws of a 4,
5, or 6 were made in lieu of the 24,576 expected (out of 49,152
throws altogether). The excess is 569 throws. Is this excess
possibly due to mere fluctuations of sampling?
The standard error is

$$\sigma = \sqrt{\tfrac{1}{2} \times \tfrac{1}{2} \times 49152} = 110.9.$$

The deviation observed is 5.1 times the standard error, and,
practically speaking, could not occur as a fluctuation of simple
sampling. It may perhaps indicate a slight bias in the dice.
The problem might, of course, have been attacked equally well
from the standpoint of the proportion in lieu of the absolute
number of 4's, 5's, or 6's thrown. This proportion is 0.5116 instead
of the theoretical 0.5000, difference in excess 0.0116. The
standard error of the proportion is

$$s = \sqrt{\frac{\tfrac{1}{2} \times \tfrac{1}{2}}{49152}} = 0.00226,$$

and the difference observed bears the same ratio to the standard
error as before, as of course it must.

Example ii. (Data from the Second Report of the Evolution
Committee of the Royal Society, 1905, p. 72.)

Certain crosses of Pisum sativum gave 5321 yellow and 1804
green seeds. The expectation is 25 per cent. of green seeds, or
1781. Can the divergence from the exact theoretical result have
arisen owing to errors of sampling only?

The numerical difference from the expected result is 23. The
standard error is

$$\sigma = \sqrt{0.25 \times 0.75 \times 7125} = 36.6.$$

Hence the divergence from theory is only some 3/5 of the
standard error, and may very well have arisen owing simply to
fluctuations of sampling.

Working from the observed proportion of green seeds, viz. 0.2532
instead of the theoretical 0.25, we have

$$s = \sqrt{0.25 \times 0.75 / 7125} = 0.0051,$$

and similarly the divergence from theory is only some 3/5 of the
standard error, as before.
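
The working of Case I may be put into a small Python function and applied to the figures of Examples i and ii. This is a sketch only, under the conditions of simple sampling; the function name is hypothetical, and the rough limit of three times the standard error is the rule of thumb stated in § 13, not an exact test.

```python
from math import sqrt


def compare_with_expectation(observed, n, p):
    """Case I: deviation of an observed number of successes from the
    expected number n*p, in units of the standard error sqrt(npq)."""
    expected = n * p
    standard_error = sqrt(n * p * (1.0 - p))
    return expected, standard_error, (observed - expected) / standard_error


if __name__ == "__main__":
    # Example i: 25,145 throws of 4, 5 or 6 in 49,152 throws, p = 1/2.
    print(compare_with_expectation(25145, 49152, 0.5))
    # Example ii: 1804 green seeds out of 7125, p = 1/4.
    print(compare_with_expectation(1804, 5321 + 1804, 0.25))
```

In the first case the ratio exceeds 5, in the second it is well under 1, in agreement with the conclusions drawn in the text.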

It should be noted that this method must not be used as a test
of association by comparing the difference of (AB) from (A)(B)/N
with a standard error calculated from the latter value as a
"theoretical number," for it is not a theoretical number given
a priori as in the above illustrations, and (A) and (B) are themselves
liable to errors of sampling. If we formed an association-table
between the results of tossing two coins N times,
$\sigma = \sqrt{\tfrac{1}{4} \times \tfrac{3}{4} \times N}$
would be the standard error for the divergence of (AB) from the
a priori value N/4, not the standard error for differences of (AB)
from (A)(B)/N, (A) and (B) being the numbers of heads thrown
in the case of the first and the second coin respectively.

Case II. Two samples from distinct materials or different
universes give proportions of A's $p_1$ and $p_2$, the numbers of
observations in the samples being $n_1$ and $n_2$ respectively. (a) Can
the difference between the two proportions have arisen merely as a
fluctuation of simple sampling, the two universes being really
similar as regards the proportion of A's therein? (b) If the
difference indicated were a real one, might it vanish, owing to
fluctuations of sampling, in other samples taken in precisely the
same way? This case corresponds to the testing of an association
which is indicated by a comparison of the proportion of A's amongst
B's and β's.

(a) We have no theoretical expectation in this case as to the
proportion of A's in the universe from which either sample has
been taken.

Let us find, however, whether the observed difference between $p_1$
and $p_2$ may not have arisen solely as a fluctuation of simple
sampling, the proportion of A's being really the same in both cases,
and given, let us say, by the (weighted) mean proportion in our
two samples together, i.e. by

$$p_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$

(the best guide that we have).

Let $\epsilon_1$, $\epsilon_2$ be the standard errors in the two samples, then

$$\epsilon_1^2 = p_0 q_0 / n_1, \qquad \epsilon_2^2 = p_0 q_0 / n_2.$$

If the samples are simple samples in the sense of the previous
work, then the mean difference between $p_1$ and $p_2$ will be zero,
and the standard error of the difference $\epsilon_{12}$, the samples being
independent, will be given by

$$\epsilon_{12}^2 = p_0 q_0 \left( \frac{1}{n_1} + \frac{1}{n_2} \right) \qquad (5)$$

If the observed difference is less than some three times $\epsilon_{12}$ it
may have arisen as a fluctuation of simple sampling only.

(b) If, on the other hand, the proportions of A's are not the same
in the material from which the two samples are drawn, but $p_1$ and
$p_2$ are the true values of the proportions, the standard errors of
sampling in the two cases are

$$\epsilon_1^2 = p_1 q_1 / n_1, \qquad \epsilon_2^2 = p_2 q_2 / n_2,$$

and consequently

$$\epsilon_{12}^2 = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2} \qquad (6)$$

If the difference between $p_1$ and $p_2$ does not exceed some three
times this value of $\epsilon_{12}$, it may be obliterated by an error of simple
sampling on taking fresh samples in the same way from the same
material.

Further, the student should note that the value of $\epsilon_{12}$ given by
equation (6) is frequently employed, in lieu of that given by
equation (5), for testing the significance of an observed difference.
The justification of this usage we indicate briefly later (Chap.
XIV. § 3). Here it is sufficient to state that, if n be large,
equation (6) gives approximately the standard-deviation of the
true values of the difference for a given observed value, and hence,
if the observed difference is greater than some three times
the value of $\epsilon_{12}$ given by (6), it is hardly possible that the true
value of the difference can be zero. The difference between the
values of $\epsilon_{12}$ given by (5) and (6) is indeed, as a rule, of more
theoretical than practical importance, for they do not differ largely
unless $p_1$ and $p_2$ differ largely, and in that case either formula will
place the difference outside the range of fluctuations of sampling.

Example iii — The following data were given in Qu 3 of Chap 
III for plants of Lobelia fulgens obtained by cross- and self-fertilisa- 
tion respectively: — 


Parentage Cross- fei til ised. 
Height — 

Above Aveiage Below Aveiage. 

}t 


Parentage Self- fertilised. 
Height — 

Above Average Below Aveiage 

12 


The figures indicate an association between tallne'^and cross- 
fertilisation of parentage Is this association significant of some 
real "difference, or may it have arisen solely as an “ error of 



270 


THEORY OF STATISTICS. 


sampling ” 1 The proportion of plants above average height in the 
two classes (cross- and self-fertilised) together is 29/68. The 
standard-deviation of the differences due to simple sampling 
between the proportions of “ tall ” plants in two samples of 34 
observations each is therefore 




$$\epsilon_{12} = \left(\frac{29}{68}\times\frac{39}{68}\times\frac{2}{34}\right)^{1/2}\times 100,$$

or 12.0 per cent. The actual proportions observed are 50 per
cent and 35 per cent: difference 15 per cent. As this difference
is only slightly in excess of the standard error of the difference,
for samples of 34 observations drawn from identical material, no
definite significance could be attached to it, if it stood alone.
The student will notice, however, that all the other cases cited
from Darwin in the question referred to show an association of
the same sign, but rather more marked. Hence the difference
observed may be a real one, or perhaps the real difference may be
greater and may be partially masked by a fluctuation of sampling.
If 50 per cent and 35 per cent were the true proportions in the
two classes, the standard error of the percentage difference would
be, by equation (6),

$$\epsilon_{12} = \left(\frac{50\times 50}{34}+\frac{35\times 65}{34}\right)^{1/2} = 11.9\ \text{per cent},$$

and consequently the actual difference might not infrequently be
completely masked by fluctuations of sampling, so long as experiments
were only conducted on the same small scale.
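
The arithmetic of Example iii can be checked with a short sketch. The function names below are illustrative only (they do not come from the text), and it is assumed that the pooled formula is the one the text calls equation (5), the formula with separate proportions being equation (6).

```python
from math import sqrt

def se_diff_pooled(p0, n1, n2):
    """Standard error of a difference when both samples share the proportion p0
    (assumed here to be equation (5))."""
    return sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))

def se_diff_separate(p1, n1, p2, n2):
    """Standard error of a difference when the two true proportions differ (equation (6))."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Lobelia fulgens: 17 of 34 cross-fertilised and 12 of 34 self-fertilised plants were tall.
n1 = n2 = 34
p0 = 29 / 68                       # proportion of tall plants in the two classes together
pooled = se_diff_pooled(p0, n1, n2)
separate = se_diff_separate(0.50, n1, 0.35, n2)
ratio = (0.50 - 0.35) / pooled     # observed difference in units of its standard error

print(round(100 * pooled, 1))      # 12.0
print(round(100 * separate, 1))    # 11.9
print(round(ratio, 2))             # 1.25
```

The ratio of about 1.25 is well short of the factor of three used in the text, which is why no definite significance is attached to the difference taken alone.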

Example iv. (Data from J. Gray, Memoir on the Pigmentation
Survey of Scotland, Jour. of the Royal Anthropological Institute,
vol. xxxvii., 1907.) The following are extracted from the tables
relating to hair-colour of girls at Edinburgh and Glasgow:

                  Of Medium       Total       Per cent.
                  Hair-colour     observed    Medium
    Edinburgh        4,008          9,743       41.1
    Glasgow         17,529         39,764       44.1

Can the difference observed in the percentage of girls of medium
hair-colour have arisen solely through fluctuations of sampling?

In the two towns together the percentage of girls with medium
hair-colour is 43.5 per cent. If this were the true percentage,
the standard error of sampling for the difference between percentages
observed in samples of the above sizes would be

$$\epsilon_{12} = (43.5\times 56.5)^{1/2}\left(\frac{1}{9743}+\frac{1}{39764}\right)^{1/2} = 0.56\ \text{per cent}.$$

The actual difference is 3.0 per cent., or over 5 times this, and
could not have arisen through the chances of simple sampling.

If we assume that the difference is a real one and calculate the
standard error by equation (6), we arrive at the same value, viz.
0.56 per cent. With such large samples the difference could not,
accordingly, be obliterated by the fluctuations of simple sampling
alone.

Case III. Two samples are drawn from distinct material or
different universes, as in the last case, giving proportions of
A's $p_1$ and $p_2$, but in lieu of comparing the proportion $p_1$ with
$p_2$ it is compared with the proportion of A's in the two samples
together, viz. $p_0$, where, as before,

$$p_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}.$$

Required to find whether the difference between $p_1$ and $p_0$ can
have arisen as a fluctuation of simple sampling, $p_0$ being the
true proportion of A's in both samples.

This case corresponds to the testing of an association which
is indicated by a comparison of the proportion of A's amongst
the B's with the proportion of A's in the universe. The general
treatment is similar to that of Case II., but the work is complicated
owing to the fact that errors in $p_1$ and $p_0$ are not independent.

If $\epsilon_{01}$ be the standard error of the difference between $p_1$ and
$p_0$, we have at once

$$\epsilon_{01}^2 = \epsilon_0^2 + \epsilon_1^2 - 2\,r_{01}\,\epsilon_0\epsilon_1, \qquad \epsilon_0^2 = \frac{p_0 q_0}{n_1+n_2}, \quad \epsilon_1^2 = \frac{p_0 q_0}{n_1},$$

$r_{01}$ being the correlation between errors of simple sampling in
$p_1$ and $p_0$. But, from the above equation relating $p_0$ to $p_1$
and $p_2$, writing it in terms of deviations in $p_1$ and $p_2$,
multiplying by the deviation in $p_1$ and summing, we have,
since errors in $p_1$ and $p_2$ are uncorrelated,

$$r_{01}\,\epsilon_0\epsilon_1 = \frac{n_1}{n_1+n_2}\,\epsilon_1^2 = \frac{p_0 q_0}{n_1+n_2}.$$

Therefore finally

$$\epsilon_{01}^2 = \epsilon_1^2 - \epsilon_0^2 = \frac{p_0 q_0}{n_1+n_2}\cdot\frac{n_2}{n_1} \qquad (7)$$

Unless the difference between $p_1$ and $p_0$ exceed, say, some
three times this value of $\epsilon_{01}$, it may have arisen solely by the
chances of simple sampling.

It will be observed that if $n_1$ be very small compared with
$n_2$, $\epsilon_{01}$ approaches, as it should, the standard error for a sample
of $n_1$ observations.

We omit, in this case, the allied problem whether, if the
difference between $p_1$ and $p_0$ indicated by the samples were
real, it might be wiped out in other samples of the same size
by fluctuations of simple sampling alone. The solution is a
little complex as we no longer have $\epsilon_0^2 = p_0 q_0/(n_1+n_2)$.

Example v. Taking the data of Example iii., suppose that
we compare the proportion of tall plants amongst the offspring
resulting from cross-fertilisations (viz. 50 per cent.) with the
proportion amongst all offspring (viz. 29/68, or 42.6 per cent.).
As, in this case, both the subsamples have the same number
of observations, $n_1 = n_2 = 34$, and

$$\epsilon_{01} = \left(\frac{29}{68}\times\frac{39}{68}\times\frac{1}{68}\right)^{1/2} = 0.06,$$

or 6 per cent. As in the working of Example iii., the observed
difference is only 1.25 times the standard error of the difference, and
consequently it may have arisen as a mere fluctuation of sampling.

Example vi. Taking now the figures of Example iv., suppose
that we had compared the proportion of girls of medium hair-colour
in Edinburgh with the proportion in Glasgow and
Edinburgh together. The former is 41.1 per cent., the latter
43.5 per cent., difference 2.4 per cent. The standard error of
the difference between the percentages observed in the subsample
of 9743 observations and the entire sample of 49,507
observations is therefore

$$\epsilon_{01} = \left(43.5\times 56.5\times\frac{39764}{9743\times 49507}\right)^{1/2} = 0.45\ \text{per cent}.$$

The actual difference is over five times this (the ratio must, of
course, be the same as in Example iv.), and could not have occurred
as a mere error of sampling.
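
Equation (7) and the remark that the ratio "must be the same as in Example iv." can be verified numerically. The following is a minimal sketch; the function name is an assumption made for illustration, and the counts are those quoted in Examples iii. and iv.

```python
from math import sqrt

def se_part_vs_whole(p0, n1, n2):
    """Equation (7): standard error of p1 - p0, where p0 is the proportion
    in the two samples (sizes n1 and n2) taken together."""
    return sqrt(p0 * (1 - p0) * n2 / (n1 * (n1 + n2)))

# Example v: 34 cross-fertilised plants compared with all 68 plants.
e5 = se_part_vs_whole(29 / 68, 34, 34)
print(round(100 * e5, 1))          # 6.0

# Example vi: Edinburgh (n1) compared with Edinburgh and Glasgow together.
n1, n2 = 9743, 39764
p1, p2 = 4008 / n1, 17529 / n2
p0 = (4008 + 17529) / (n1 + n2)
e6 = se_part_vs_whole(p0, n1, n2)
print(round(100 * e6, 2))          # 0.45

# The same data treated as in Example iv (difference between the two towns).
e12 = sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
ratio_case_ii = (p2 - p1) / e12
ratio_case_iii = (p0 - p1) / e6    # agrees with ratio_case_ii up to rounding error
```

Both ratios come out a little above five when computed from the unrounded counts, in agreement with the statement in the text.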

REFERENCES.

The theory of sampling, for the cases dealt with in this chapter, is generally
treated by first determining the frequency-distribution of the number of
successes in a sample. This frequency-distribution is not considered till
Chapter XV., and the student will be unable to follow much of the literature
until he has read that chapter.

Experimental results of dice throwing, coin tossing, etc.

(1) Quetelet, A., Lettres . . . sur la théorie des probabilités; Bruxelles,
1846. (English translation by O. G. Downes; C. & E. Layton, London,
1849.) See especially letter xiv. and the table on p. 374 of the
French, p. 255 of the English, edition.

(2) Westergaard, H., Die Grundzüge der Theorie der Statistik; Fischer,
Jena, 1890.

(3) Edgeworth, F. Y., Article on the "Law of Error" in the Tenth Edition
of the Encyclopædia Britannica, vol. xxviii., 1902, p. 280; or on
"Probability," Eleventh Edition, vol. xxii. (especially Part II.,
pp. 390 et seq.).

(4) Darbishire, A. D., "Some Tables for illustrating Statistical Correlation,"
Mem. and Proc. of the Manchester Lit. and Phil. Soc., vol. li., 1907.

General: and applications to sex-ratio of births.

(5) Poisson, S. D., "Sur la proportion des naissances des filles et des
garçons," Mémoires de l'Acad. des Sciences, vol. ix., 1829, p. 239.
(Principally theoretical: the statistical illustrations very slight.)

(6) Lexis, W., Zur Theorie der Massenerscheinungen in der menschlichen
Gesellschaft; Freiburg, 1877.

(7) Lexis, W., Abhandlungen zur Theorie der Bevölkerungs- und Moralstatistik;
Fischer, Jena, 1903. (Contains, with new matter, reprints of
some of Professor Lexis' earlier papers in a form convenient for
reference.)

(8) Edgeworth, F. Y., "Methods of Statistics," Jour. Roy. Stat. Soc.,
jubilee volume, 1885, p. 181.

(9) Venn, John, The Logic of Chance, 3rd edn.; Macmillan, London, 1888.
(Cf. the data regarding the distribution of sexes in families on p. 264,
to which reference was made in § 9.)

(10) Pearson, Karl, "Skew Variation in Homogeneous Material," Phil.
Trans. Roy. Soc., Series A, vol. clxxxvi., 1895, p. 343. (Sections 2 to
6 on the binomial distribution.)

(11) Edgeworth, F. Y., "Miscellaneous Applications of the Calculus of
Probabilities," Jour. Roy. Stat. Soc., vols. lx., lxi., 1897-8 (especially
part ii., vol. lxi. p. 119).

(12) Vigor, H. D., and G. U. Yule, "On the Sex-ratios of Births in the
Registration Districts of England and Wales, 1881-90," Jour. Roy.
Stat. Soc., vol. lxix., 1906, p. 576. (Use of the harmonic mean as in
§ 11.)

As regards the sex-ratio, reference may also be made to papers in
vols. v. and vi. of Biometrika by Heron, Weldon, and Woods.

(13) Yule, G. U., "Fluctuations of Sampling in Mendelian Ratios," Proc.
Camb. Phil. Soc., vol. xvii., 1914, p. 425.

The law of small chances (§ 12).

(14) Poisson, S. D., Recherches sur la probabilité des jugements, etc.; Paris,
1837. (Pp. 205-7.)

(15) Bortkewitsch, L. von, Das Gesetz der kleinen Zahlen; Teubner,
Leipzig, 1898.

(16) Student, "On the Error of Counting with a Hæmacytometer,"
Biometrika, vol. v. p. 351, 1907.

(17) Rutherford, E., and H. Geiger, with a note by H. Bateman,
"The probability variations in the distribution of α particles," Phil.
Mag., Series 6, vol. xx., 1910, p. 698. (The frequency of particles
emitted during a small interval of time follows the law of small
chances: the law deduced by Bateman in ignorance of previous work.)

(18) Soper, H. E., "Tables of Poisson's Exponential Binomial Limit,"
Biometrika, vol. x., 1914, pp. 25-35.

(19) Whitaker, Lucy, "On Poisson's Law of Small Numbers," Biometrika,
vol. x., 1914, pp. 36-71.


EXERCISES. 


1. (Ref. 4: total of columns of all the 13 tables given.)

Compare the actual with the theoretical mean and standard-deviation for
the following record of 6500 throws of 12 dice, 4, 5, or 6 being reckoned
as a "success."

    Successes   Frequency        Successes   Frequency
        0            1               7         1351
        1           14               8          844
        2          103               9          391
        3          302              10          117
        4          711              11           21
        5         1231              12            3
        6         1411
                                   Total       6500


2. (Ref. 1.)

Balls were drawn from a bag containing equal numbers of black and white
balls, each ball being returned before drawing another. The records were then
grouped by counting the number of black balls in consecutive 2's, 3's, 4's, 5's,
etc. The following give the distributions so derived for grouping by 5's, 6's,
and 7's. Compare actual with theoretical means and standard-deviations.

    Successes   (a) Grouping   (b) Grouping   (c) Grouping
                  by Fives       by Sixes       by Sevens
        0            30             17              9
        1           125             65             34
        2           277            166            104
        3           224            192            151
        4           136            166            148
        5            27             69             95
        6             -              8             40
        7             -              -              4
      Total         819            683            585


3. (Ref. 2, p. 22.)

Ten thousand drawings of a ball from a bag containing equal numbers of
black and white were made in the same manner as in the preceding example,
and then grouped into 100 sets of 100. The following gives the resulting
frequency of different numbers of white balls. Compare mean and standard
deviation with theory.

    Number  Frequency      Number  Frequency      Number  Frequency
      34        1            44        3            54        8
      35        -            45        4            55        3
      36        -            46        5            56        5
      37        -            47        6            57        4
      38        -            48        5            58        4
      39        1            49       11            59        -
      40        2            50        9            60        -
      41        2            51        5            61        1
      42        2            52       10            62        1
      43        3            53        4            63        1






4. The proportion of successes in the data of Qu. 1 is 0.5097. Find the
standard-deviation of the proportion with the given number of throws, and state
whether you would regard the excess of successes as probably significant of bias
in the dice.

5. In the 4096 drawings on which Qu. 2 is based 2030 balls were black
and 2066 white. Is this divergence probably significant of bias?

6. If a frequency-distribution such as those of Questions 1, 2, and 3 be given,
show how n and p, if unknown, may be approximately determined from the
mean and standard-deviation of the distribution.

Find n and p in this way from the data of Qu. 1 and Qu. 3.

7. Verify the following results for Table VI. of Chapter IX., p. 163, and
compare the results of the different grouping of the table on p. 263. In
calculating the actual standard-deviation, use Sheppard's correction for
grouping (p. 212).

    Row or Rows        Mean     Actual Standard-    Standard-deviation *
                                  deviation s         of Sampling s0
    1                  508.2         11.60               11.18
    2                  509.5          6.79                6.45
    3                   ...           5.28                5.00
    4                  511.1          5.03                4.22
    5                  510.2          3.67                3.73
    6, 7                ...           4.13                3.24
    8, 9, 10, 11       508.7          3.10                2.69
    12, 13, 14         508.4          2.55                2.25
    15 and upwards     508.2          2.13                1.85


8. In a case of mice-breeding (see reference given in § 11) the harmonic
mean number in a litter was 4.735, and the expected proportion of albinos
50 per cent. Find the standard-deviation of simple sampling for the
proportion of albinos in a litter, and state whether the actual standard-deviation
(21.63 per cent.) probably indicates any real variation, or not.

9. (Data from Report i., Evolution Committee of the Royal Society, p. 17.)
In breeding certain stocks 408 hairy and 126 glabrous plants were obtained.
If the expectation is one-fourth glabrous, is the divergence significant, or might
it have occurred as a fluctuation of sampling?

10. (Data of Example ix. and Qu. 5, Chap. III.) Is the association in
either of the following cases likely to have arisen as a fluctuation of simple
sampling?

{a) {AB)=^^7 = n {aB) = 21 {a$) = B 

{b) (AB) = B09 (^j8) = 214 (aR) = 132 (a^) = 119 

11. The sex-ratio at birth is sometimes given by the ratio of male to female
births, instead of the proportion of male to total births. If Z is the ratio, i.e.
$Z = p/q$, show that the standard error of Z is approximately $(1+Z)\sqrt{Z/n}$,
n being large, so that deviations are small compared with the mean. (The
student may find it useful to refer to § 8, Chap. XI.)

* Based on the mid-value of the class-interval for single rows, or the
harmonic mean of the mid-values for groups of rows.




CHAPTER XIV.

SIMPLE SAMPLING CONTINUED: EFFECT OF
REMOVING THE LIMITATIONS OF SIMPLE SAMPLING.

1. Warning as to the assumption that three times the standard error gives the
range for the majority of fluctuations of simple sampling of either sign.
2. Warning as to the use of the observed for the true value of p in
the formula for the standard error. 3. The inverse standard error, or
standard error of the true proportion for a given observed proportion:
equivalence of the direct and inverse standard errors when n is large.
4-8. The importance of errors other than fluctuations of "simple
sampling" in practice: unrepresentative or biassed samples. 9-10.
Effect of divergences from the conditions of simple sampling: (a)
effect of variation in p and q for the several universes from which the
samples are drawn. 11-12. (b) Effect of variation in p and q from one
sub-class to another within each universe. 13-14. (c) Effect of a
correlation between the results of the several events. 15. Summary.

1. There are two warnings as regards the methods adopted in
the examples in the concluding section of the last chapter
which the student should note, as they may become of importance
when the number of observations is small. In the first place, he
should remember that, while we have taken three times the
standard error as giving the limits within which the great
majority of errors of sampling of either sign are contained,
the limits are not, as a rule, strictly the same for positive and
for negative errors. As is evident from the examples of actual
distributions in § 7, Chap. XIII., the distribution of errors is not
strictly symmetrical unless p = q = 0.5. No theoretical rule as
to the limits can be given, but it appears from the examples
referred to and from the calculated distributions in Chap. XV.
§ 3, that a range of three times the standard error includes
the great majority of the deviations in the direction of the
longer "tail" of the distribution, while the same range on the
shorter side may extend beyond the limits of the distribution
altogether. If, therefore, p be less than 0.5, our assumed range
may be greater than is possible for negative errors, or if p be
greater than 0.5, greater than is possible for positive errors. The
assumption is not, however, likely as a rule to lead to a serious
mistake; as stated at the commencement of this paragraph, the
point is of importance only when n is small, for when n is large the
distribution tends to become sensibly symmetrical even for values
of p differing considerably from 0.5. (Cf. Chap. XV. for the
properties of the limiting form of distribution.)
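
The warning of § 1 can be illustrated numerically. The sketch below uses hypothetical values (n = 20 and n = 2000 with p = 0.1, chosen only for illustration) and the exact binomial distribution to show that, for small n, the range of three times the standard error runs below zero on the shorter side while still leaving a small tail beyond it on the longer side.

```python
from math import comb, sqrt

def binomial_pmf(n, p):
    """Exact chances of 0, 1, ..., n successes in n events."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def three_sigma_limits(n, p):
    """Mean number of successes minus and plus three times the standard error."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    return mean - 3 * sd, mean + 3 * sd

# Small sample: the lower limit is negative, i.e. outside the possible range.
small_low, small_high = three_sigma_limits(20, 0.1)
pmf = binomial_pmf(20, 0.1)
tail_beyond_upper = sum(pr for k, pr in enumerate(pmf) if k > small_high)

# Large sample with the same p: both limits fall inside the possible range 0..n.
large_low, large_high = three_sigma_limits(2000, 0.1)

print(small_low < 0)      # True
print(large_low > 0)      # True
```

With these assumed figures the chance of exceeding the upper limit is a fraction of one per cent, so the range still covers the great majority of deviations on the side of the longer tail.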

2. In the second place, the student should note that, where we
were unable to assign any a priori value to p, we have assumed
that it is sufficiently accurate to replace p in the formula for the
standard error by the proportion actually observed, say $\pi$.
Where n is large so that the standard error of p becomes small
relatively to the product pq the assumption is justifiable, and no
serious error is possible. If, however, n be small, the use of the
observed value $\pi$ may lead to an under- or over-estimation of the
standard error which cannot be neglected. To get some rough
idea of the possible importance of such effects, the approximate
standard error $\epsilon$ may first be calculated as usual from the
observed proportion $\pi$, and then fresh values recalculated, replacing
$\pi$ by $\pi \pm 3\epsilon$. It should be remembered that the maximum
value of the product pq is given by p = q = 0.5, and hence these
values, if within the limits of fluctuations of sampling, will give
one limiting value for the standard error. The procedure is by
no means exact, but may serve to give a useful warning.

Thus in Example iii. of Chap. XIII. the observed proportion of
tall plants is 29/68, or, say, 43 per cent. The standard error of
this proportion is 6 per cent., and a true proportion of 50 per
cent. is therefore well within the limits of fluctuations of sampling.
The maximum value of the standard error is therefore

$$\left(\frac{50\times 50}{68}\right)^{1/2} = 6.06\ \text{per cent}.$$

On the other hand, the standard error is unlikely to be lower
than that based on a proportion of 43 - 18 = 25 per cent.,

$$\left(\frac{25\times 75}{68}\right)^{1/2} = 5.25\ \text{per cent}.$$
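
The rough procedure of § 2 may be sketched as follows. The helper name is an assumption made for illustration; the figures are those of the worked case above (observed proportion taken as 43 per cent., n = 68).

```python
from math import sqrt

def se_percent(pct, n):
    """Standard error, in per cent, of a percentage pct observed in n cases."""
    return sqrt(pct * (100 - pct) / n)

n = 68
observed = 43.0                       # 29/68, rounded as in the text
e = se_percent(observed, n)           # approximate standard error
low, high = observed - 3 * e, observed + 3 * e

# The product pq is greatest at 50 per cent; use that value if it lies within the limits.
upper_se = se_percent(50.0, n) if low <= 50.0 <= high else max(se_percent(low, n), se_percent(high, n))
# The smallest value arises at whichever limit lies further from 50 per cent.
lower_se = min(se_percent(low, n), se_percent(high, n))

print(round(e, 2))          # 6.0
print(round(upper_se, 2))   # 6.06
print(round(lower_se, 2))   # 5.25
```

The standard error thus lies roughly between 5.25 and 6.06 per cent., so in this instance replacing p by the observed proportion makes little difference.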

3. The two difficulties mentioned in §§ 1 and 2 arise when n,
the number of cases in the sample, is small. The interpretation
of the value of the standard error is also more limited in this
case than when n is large. Suppose a large number of observations
to be made, by means of samples of n observations each, on
different masses of material, or in different universes, for each of
which the true value of p is known. On these data we could
form a correlation-table between the true proportion p in a given
universe and the observed proportion $\pi$ in a sample of n observations
drawn therefrom. What we have found from the work of
the last chapter is that the standard-deviation of an array of $\pi$'s
associated with a certain true value p, in this table, is $(pq/n)^{1/2}$;
but the question may be asked: What is the standard-deviation
of the array at right angles to this, i.e. the array of p's associated
with a certain observed proportion $\pi$? In other words, given an
observed proportion $\pi$, what is the standard-deviation of the true
proportions? This is the inverse of the problem with which we
have been dealing, and it is a much more difficult problem.
On general principles, however, we can see that if n be large,
the two standard-deviations will tend, on the average of all
values of p, to be nearly the same, while if n be small the
standard-deviation of the array of $\pi$'s will tend to be appreciably the
greater of the two. For if $\pi = p + \delta$, $\delta$ is uncorrelated with p,
and therefore if $\sigma_p$ be the standard-deviation of p in all the
universes from which samples are drawn, $\sigma_\pi$ the standard-deviation
of observed proportions in the samples, and $\sigma_\delta$ the
standard-deviation of the differences,

$$\sigma_\pi^2 = \sigma_p^2 + \sigma_\delta^2.$$

But $\sigma_\delta^2$ varies inversely as n. Hence if n become very large, $\sigma_\delta$
becomes very small, $\sigma_\pi$ becomes sensibly equal to $\sigma_p$, and therefore
the standard-deviations of the arrays, on an average, are also
sensibly equal. If n be large, therefore, $[\pi(1-\pi)/n]^{1/2}$ may be
taken as giving, with sufficient exactness, the standard-deviation
of the true proportion p for a given observed proportion $\pi$. But
if n be small, $\sigma_\delta$ cannot be neglected in comparison with $\sigma_p$: $\sigma_\pi$ is
therefore appreciably greater than $\sigma_p$, and the standard-deviation
of the array of $\pi$'s is, on an average of all arrays, correspondingly
greater than the standard-deviation of the array of p's, though the
statement is not true for every pair of corresponding arrays, especially
for extreme values of p near 0 and 1. Further, it should be
noticed that, while the regression of $\pi$ on p is unity (i.e. the
mean of the array of $\pi$'s is identical with p, the type of the
array), the regression of p on $\pi$ is less than unity. If we assume,
therefore, that a tabulation of all possible chances, observed
for every conceivable subject, would give a distribution of p
ranging uniformly between 0 and 1, or indeed grouped symmetrically
in any way round 0.5, any observed value $\pi$ greater than
0.5 will probably correspond to a true value of p slightly lower
than $\pi$, and conversely. We have already referred to the use of
the inverse standard error in § 13 of Chap. XIII. (Case II., p. 269).
If we determine, for example, the standard error of the difference
between two observed proportions by equation (6) of that chapter,
this may be taken, provided n be large, as approximately the
standard-deviation of true differences for the given observed
difference.
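
The relation between the standard-deviations of $\pi$, p and $\delta$ discussed in § 3 can be checked exactly for a small artificial case. The following sketch assumes five equally frequent universes with the hypothetical chances listed in `ps`; these values and the function name are illustrative and are not taken from the text.

```python
from math import comb

def observed_proportion_variance(ps, n):
    """Exact variance of the observed proportion when each chance in ps is
    equally likely and a sample of n events is drawn from the chosen universe."""
    m1 = m2 = 0.0
    for p in ps:
        for k in range(n + 1):
            w = comb(n, k) * p**k * (1 - p)**(n - k) / len(ps)
            m1 += w * k / n
            m2 += w * (k / n) ** 2
    return m2 - m1**2

ps = [0.3, 0.4, 0.5, 0.6, 0.7]                        # assumed true proportions
mean_p = sum(ps) / len(ps)
var_p = sum((p - mean_p) ** 2 for p in ps) / len(ps)  # square of sigma_p

results = {}
for n in (10, 400):
    var_delta = sum(p * (1 - p) for p in ps) / len(ps) / n   # varies inversely as n
    var_pi = observed_proportion_variance(ps, n)
    results[n] = (var_pi, var_p + var_delta, (var_pi / var_p) ** 0.5)
```

For n = 10 the standard-deviation of the observed proportions is nearly half as large again as that of the true proportions; for n = 400 the two agree to within about one and a half per cent., which is the sense in which the direct and inverse standard errors become equivalent when n is large.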

4. The use of standard errors must be exercised with care. It
is very necessary to remember the limited assumptions on which
the theory of simple sampling is based, and to bear in mind that
it covers those fluctuations alone which exist when all the assumed
conditions are fulfilled. The formulæ obtained for the standard
errors of proportions and of their differences have no bearing
except on the one question, whether an observed divergence of a
certain proportion from a certain other proportion that might be
observed in a more extended series of observations, or that has
actually been observed in some other series, might or might not
be due to fluctuations of simple sampling alone. Their use is
thus quite restricted, for in many cases of practical sampling this
is not the principal question at issue. The principal question in
many such cases concerns quite a different point, viz. whether the
observed proportion $\pi$ in the sample may not diverge from the
proportion p existing in the universe from which it was drawn,
owing to the nature of the conditions under which the sample was
taken, $\pi$ tending to be definitely greater or definitely less than
p. Such divergence between $\pi$ and p might arise in two distinct
ways: (1) owing to variations of classification in sorting the
A's and α's, the characters not being well defined, a source of
error which we need not further discuss, but one which may lead
to serious results [cf. ref. 5 of Chap. V.]; (2) owing to either A's
or α's tending to escape the attentions of the sampler. To give
an illustration from artificial chance, if on drawing samples from
a bag containing a very large number of black and white balls
the observed proportion of black balls was $\pi$, we could not
necessarily infer that the proportion of black balls in the bag was
approximately $\pi$, even though the standard error were small, and
we knew that the proportions in successive samples were subject
to the law of simple sampling. For the black balls might be,
say, much more highly polished than the white ones, so as to
tend to escape the fingers of the sampler, or they might be
represented by a number of lively black insects sheltering amongst
white stones: in neither case would the ratio of black balls to
white, or of insects to stones, be represented in their proper
proportions. Clearly, in any parallel case, inferences as to the
material from which the sample is drawn are of a very doubtful
and uncertain kind, and it is this uncertainty whether the chance
of inclusion in the sample is the same for A's and α's, far more
than the mere divergences between different samples drawn in
the same way, which renders many statistical results based on
samples so dubious.
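
The illustration of the polished balls can be imitated by a small simulation. Everything in the sketch below is an assumption made for illustration: a bag with equal numbers of black and white balls, black balls only half as likely as white to be grasped, and a fixed seed so that the run is repeatable.

```python
import random
from math import sqrt

def biased_draws(true_p, weight_black, n, rng):
    """Number of black balls in n draws with replacement when a black ball is
    `weight_black` times as likely to be picked as a white one."""
    pick_black = true_p * weight_black / (true_p * weight_black + (1 - true_p))
    return sum(rng.random() < pick_black for _ in range(n))

rng = random.Random(0)          # fixed seed, for repeatability only
n = 10_000
true_p = 0.5                    # assumed proportion of black balls in the bag
blacks = biased_draws(true_p, 0.5, n, rng)

observed = blacks / n                             # close to 1/3, not 1/2
se = sqrt(observed * (1 - observed) / n)          # small: about half of one per cent
error_in_units_of_se = abs(observed - true_p) / se
```

The standard error is small and the draws obey the law of simple sampling, yet the observed proportion misses the true one by many times the standard error, because the standard error measures only the fluctuations of sampling and says nothing of the bias.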

5. Thus in collecting returns as to family income and expenditure
from working-class households, the families with lower
incomes are almost certain to be under-represented; they largely
"escape the sampler's fingers" from their simple lack of ability
to keep the necessary accounts. It is almost impossible to say,
however, to what extent they are under-represented, or to form
any estimate as to the possible error when two such samples
taken by different persons at different times, or in different places,
are compared. Again, if estimates as to crop-production are
formed on the basis of a limited number of voluntary returns,
the estimates are likely to err in excess, as the persons who
make the returns will probably include an undue proportion
of the more intelligent farmers whose crops will tend to be
above average. Whilst voluntary returns are in this way liable
to lead to more or less unrepresentative samples, compulsory
sampling does not evade the difficulty. Compulsion could not
ensure equally accurate and trustworthy returns from illiterate
and well-educated workmen, from intelligent and unintelligent
farmers. The following of some definite rule in drawing the
sample may also produce unrepresentative samples: if samples
of fruit were taken solely from the top layers of baskets exposed
for sale, the results might be unduly favourable; if from the
bottom layer, unduly unfavourable.

6. In such cases we can see that any sample, taken in the
way supposed, is likely to be definitely biassed, in the sense
that it will not tend to include, even in the long run, equal
proportions of the A's and α's in the original material. In other
cases there may be no obvious reason for presuming such bias,
but, on the other hand, no certainty that it does not exist. Thus
if we noted the hair-colours of the children in, say, one
school in ten in a large town, the question would arise whether
this method would tend to give an unbiassed sample of all the
children. No assured answer could be given: conjectures on
the matter would be based in part on the way in which the
schools were selected, e.g. the volunteering of teachers for the work
might in itself introduce an element of bias. Again, if say
10,000 herrings were measured as landed at various North Sea
ports, and the question were raised whether the sample was
likely to be an unbiassed sample of North Sea herrings, no
assured answer could be given. There may be no definite reason
for expecting definite bias in either case, but it may exist, and
no mere examination of the sample itself can give any information
as to whether it exists or no.





7. Such an examination may be of service, however, as
indicating one possible source of bias, viz. great heterogeneity in
the original material. If, for example, in the first illustration,
the hair-colours of the children differed largely in the different
schools, much more largely than would be accounted for by
fluctuations of simple sampling, it would be obvious that one
school would tend to give an unrepresentative sample, and
questionable therefore whether the five, ten or fifteen schools
observed might not also have given an unrepresentative sample.
Similarly, if the herrings in different catches varied largely, it
would, again, be difficult to get a representative sample for a
large area. But while the dissimilarity of subsamples would
then be evidence as to the difficulty of obtaining a representative
sample, the similarity of subsamples would, of course, be no
evidence that the sample was representative, for some very
different material which should have been represented might
have been missed or overlooked.

8. The student must therefore be very careful to remember
that even if some observed difference exceed the limits of fluctuation
in simple sampling, it does not follow that it exceeds the
limits of fluctuation due to what the practical man would regard,
and quite rightly regard, as the chances of sampling. Further,
he must remember that if the standard error be small, it by no
means follows that the result is necessarily trustworthy: the
smallness of the standard error only indicates that it is not
untrustworthy owing to the magnitude of fluctuations of simple
sampling. It may be quite untrustworthy for other reasons:
owing to bias in taking the sample, for instance, or owing to definite
errors in classifying the A's and α's. On the other hand, of course,
it should also be borne in mind that an observed proportion is not
necessarily incorrect, but merely to a greater or less extent
untrustworthy if the standard error be large. Similarly, if an
observed proportion $\pi_1$ in a sample drawn from one universe be
greater than an observed proportion $\pi_2$ in a sample drawn from
another universe, but $\pi_1 - \pi_2$ is considerably less than three times
the standard error of the difference, it does not, of course, follow
that the true proportions for the given universes, $p_1$ and $p_2$, are
most probably equal. On the contrary, $p_1$ most likely exceeds $p_2$;
the standard error only warns us that this conclusion is more or
less uncertain, and that possibly $p_2$ may even exceed $p_1$.

9. Let us now consider the effect, on the standard-deviation of
sampling, of divergences from the conditions of simple sampling
which were laid down in § 8 of Chap. XIII.

First suppose the condition (a) to break down, so that there is
some essential difference between the localities from which, or the
conditions under which, samples are drawn, or that some essential
change has taken place during the period of sampling. We may
represent such circumstances in a case of artificial chance by
supposing that for the first $f_1$ throws of n dice the chance of
success for each die is $p_1$, for the next $f_2$ throws $p_2$, for the next $f_3$
throws $p_3$, and so on, the chance of success varying from time to
time, just as the chance of death, even for individuals of the same
age and sex, varies from district to district. Suppose, now, that
the records of all these throws are pooled together. The mean
number of successes per throw of the n dice is given by

$$M = \frac{n}{N}\left(f_1 p_1 + f_2 p_2 + f_3 p_3 + \dots\right) = n p_0,$$

where $N = \Sigma(f)$ is the whole number of throws and $p_0$ is the mean
value $\Sigma(fp)/N$ of the varying chance p. To find the standard-deviation
of the number of successes at each throw consider that
the first set of throws contributes to the sum of the squares of
deviations an amount

$$f_1\left[n p_1 q_1 + n^2 (p_1 - p_0)^2\right],$$

$n p_1 q_1$ being the square of the standard-deviation for these throws,
and $n(p_1 - p_0)$ the difference between the mean number of
successes for the first set and the mean for all the sets together.
Hence the standard-deviation $\sigma$ of the whole distribution is given
by the sum of all quantities like the above, or

$$N\sigma^2 = n\,\Sigma(fpq) + n^2\,\Sigma f(p - p_0)^2.$$

Let $\sigma_p$ be the standard-deviation of p, then the last sum is
$N n^2 \sigma_p^2$, and substituting $1 - p$ for q, we have

$$\sigma^2 = n p_0 - n p_0^2 - n\sigma_p^2 + n^2\sigma_p^2 = n p_0 q_0 + n(n-1)\sigma_p^2 \qquad (1)$$

This is the formula corresponding to equation (1) of Chap.
XIII.; if we deal with the standard-deviation of the proportion
of successes, instead of that of the absolute number, we have,
dividing through by $n^2$, the formula corresponding to equation
(2) of Chap. XIII., viz.

$$s^2 = \frac{p_0 q_0}{n} + \frac{n-1}{n}\,\sigma_p^2 \qquad (2)$$
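
Equations (1) and (2) can be verified exactly for a small artificial case. In the sketch below the numbers of throws `fs` and the chances `ps` are hypothetical values chosen for illustration, and the function name is likewise an assumption.

```python
from math import comb

def pooled_variance(n, fs, ps):
    """Exact variance of the number of successes in a throw of n dice when
    f_i of the throws are made with chance of success p_i."""
    N = sum(fs)
    m1 = m2 = 0.0
    for f, p in zip(fs, ps):
        for k in range(n + 1):
            w = f / N * comb(n, k) * p**k * (1 - p)**(n - k)
            m1 += w * k
            m2 += w * k * k
    return m2 - m1**2

n = 12
fs = [100, 200, 100]            # assumed numbers of throws in each set
ps = [0.4, 0.5, 0.6]            # assumed chance of success in each set
N = sum(fs)
p0 = sum(f * p for f, p in zip(fs, ps)) / N
var_p = sum(f * (p - p0) ** 2 for f, p in zip(fs, ps)) / N

exact = pooled_variance(n, fs, ps)
by_equation_1 = n * p0 * (1 - p0) + n * (n - 1) * var_p     # equation (1)
by_equation_2 = p0 * (1 - p0) / n + (n - 1) / n * var_p     # equation (2), for proportions
simple_sampling = n * p0 * (1 - p0)                         # value if p did not vary
```

With these figures the pooled variance is about 3.66 against 3.00 for simple sampling, the excess being the term $n(n-1)\sigma_p^2$.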


10. If n be large and $s_0$ be the standard-deviation calculated
from the mean proportion of successes $p_0$, equation (2) is sensibly
of the form

$$s^2 = s_0^2 + \sigma_p^2,$$

and hence, knowing s and $s_0$, we can find $\sigma_p$, the standard-deviation
of the chance or proportion in the universes from which the
samples have been drawn.

Table showing Frequencies of Registration Districts in England and Wales
with Different Proportions of Deaths in Childbirth (including Deaths
from Puerperal Fever) per 1000 Births in the same Year, for the same
Groups of Districts as in the Table of Chap. XIII. § 10. Data from same
source. Decade 1881-90.

    Deaths in                     Number of Births in the Decade
    Childbirth,       1500    3500    4500   10,000  15,000  30,000  60,000
    per 1000           to      to      to      to      to      to      to
    Births            2500    4000    5000   15,000  20,000  50,000  90,000

     1.5- 2.0           -       -       2       -       -       -       -
     2.0- 2.5           1       -       1       1       -       -       -
     2.5- 3.0           1       3       1       -       -       -       -
     3.0- 3.5           1       5       2       4       -       1       2
     3.5- 4.0           5       6       5       8       5       5       9
     4.0- 4.5           6       5       8      23       4       9       6
     4.5- 5.0           2       5       9      14      11       7       5
     5.0- 5.5           7       3       6      14       6       8       7
     5.5- 6.0           5       3       4       5       2       5       4
     6.0- 6.5           1       5       1       -       4       1       1
     6.5- 7.0           3       1       1       3       -       2       1
     7.0- 7.5           1       1       -       -       -       4       -
     7.5- 8.0           -       -       -       -       -       1       -
     8.0- 8.5           -       -       -       -       -       -       -
     8.5- 9.0           1       1       -       -       1       -       -
     9.0- 9.5           -       -       -       -       -       -       -
     9.5-10.0           1       -       -       1       -       -       -
    10.0-10.5           -       -       -       -       -       -       -
    10.5-11.0           1       -       -       -       -       -       -

    Total              36      38      40      73      33      43      35

    Mean              5.29    4.71    4.45    4.68    4.99    5.13    4.64
    Standard-
    deviation s       1.77    1.37    1.09    1.01    0.99    1.12    0.87
    Theoretical
    standard-
    deviation s0
    corresponding
    to mean births    1.62    1.12    0.97    0.61    0.53    0.36    0.26

    (s^2 - s0^2)^(1/2) 0.71    0.80    0.61    0.80    0.84    1.07    0.83

The values of $\sqrt{s^2 - s_0^2}$ are tabulated at the foot of the table showing the distribution of the proportion of male births in certain registration districts of England, in § 10 of Chap. XIII., p. 263. It will be seen that in the first group of small districts there appears to be a significant standard-deviation of some 6 units in the proportion of male births per thousand, but in the more urban districts this falls to 1 or 2 units; in one case only does $s$ fall short of $s_0$. In the table on p. 283 are given some different data relating to the deaths of women in childbirth in the same groups of districts, and in this case the effect of definite causes is relatively larger, as one might expect. The values of $\sqrt{s^2 - s_0^2}$ suggest an almost uniform significant standard-deviation $\sigma_p = 0.8$ in the deaths of women per thousand births, five out of the eight values being very close to this average. The figures of this case also bring out clearly one important consequence of (2), viz. that if we make $n$ large $s$ becomes sensibly equal to $\sigma_p$, while if we make $n$ small $s$ becomes more nearly equal to $s_0$. Hence if we want to know the significant standard-deviation of the proportion $p$ (the measure of its fluctuation owing to definite causes) $n$ should be made as large as possible; if, on the other hand, we want to obtain good illustrations of the theory of simple sampling $n$ should be made small. If $n$ be very large the actual standard-deviation may evidently become almost indefinitely large compared with the standard-deviation of sampling. Thus during the 20 years 1855-74 the death-rate in England and Wales fluctuated round a mean value of 22.2 per thousand with a standard-deviation of 0.86. Taking the mean population as roughly 21 millions, the standard-deviation of sampling is approximately

$$\sqrt{\frac{22 \times 978}{21 \times 10^6}} = 0.032.$$

This is only about one twenty-seventh of the actual value.
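The arithmetic of this section can be reproduced with a short sketch in Python (not part of the original text; the function names are illustrative assumptions). It evaluates the standard-deviation of sampling for the death-rate example, and the significant standard-deviation $\sigma_p = \sqrt{s^2 - s_0^2}$ for the first group of districts in the childbirth table, where $s = 1.77$ and $s_0 = 1.62$.

```python
from math import sqrt

def sampling_sd_per_thousand(rate_per_thousand, n):
    """Standard-deviation of simple sampling of a rate per 1000 among n individuals."""
    return sqrt(rate_per_thousand * (1000 - rate_per_thousand) / n)

def significant_sd(s, s0):
    """sigma_p from s^2 = s0^2 + sigma_p^2 (the form of equation (2) for large n)."""
    return sqrt(s * s - s0 * s0)

s0 = sampling_sd_per_thousand(22, 21e6)          # death-rate example, 1855-74
print(round(s0, 3))                              # about 0.032
print(round(0.86 / s0))                          # actual s.d. is about 27 times as large
print(round(significant_sd(1.77, 1.62), 2))      # about 0.71 for the first group
```

Since the tabulated values of $s$ and $s_0$ are themselves rounded to two decimals, values recomputed in this way need not agree with the printed foot of the table in the last figure.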

11. Now consider the effect of altering the second condition of simple sampling, given in § 8 (b) of Chapter XIII., viz. the condition that the chances $p$ and $q$ shall be the same for every die or coin in the set, or the circumstances that regulate the appearance of the character observed the same for every individual or every sub-class in each of the universes from which samples are drawn. Suppose that in the group of $n$ dice thrown the chances for $m_1$ dice are $p_1$, $q_1$; for $m_2$ dice, $p_2$, $q_2$; and so on, the chances varying for different dice, but being constant throughout the experiment. The case differs from the last, as in that the chances were the same for every die, at any one throw, but varied from one throw to another: now they are constant from throw to throw, but differ from one die to another as they would in any ordinary set of badly made dice. Required to find the effect of these differing chances.





For the mean number of successes we evidently have

$$M = m_1 p_1 + m_2 p_2 + \ldots = n p_0,$$

$p_0$ being the mean chance $\Sigma(mp)/n$. To find the standard-deviation of the number of successes at each throw, it should be noted that this may be regarded as made up of the number of successes in the $m_1$ dice for which the chances are $p_1$, $q_1$, together with the number of successes amongst the $m_2$ dice for which the chances are $p_2$, $q_2$, and so on: and these numbers of successes are all independent. Hence

$$\sigma^2 = m_1 p_1 q_1 + m_2 p_2 q_2 + \ldots = \Sigma(mpq).$$

Substituting $1 - p$ for $q$, as before, and using $\sigma_p$ to denote the standard-deviation of $p$,

$$\sigma^2 = n p_0 q_0 - n\sigma_p^2, \qquad (3)$$

or if $s$ be, as before, the standard-deviation of the proportion of successes,

$$s^2 = \frac{p_0 q_0}{n} - \frac{\sigma_p^2}{n}. \qquad (4)$$

12. The effect of the chances varying for the individual dice or other “events” is therefore to lower the standard-deviation, as calculated from the mean proportion $p_0$, and the effect may conceivably be considerable. To take a limiting case, if $p$ be zero for half the events and unity for the remainder, $p_0 = q_0 = \frac{1}{2}$ and $\sigma_p = \frac{1}{2}$, so that $s$ is zero. To take another illustration, still somewhat extreme, if the values of $p$ are uniformly distributed over the whole range between 0 and 1, $p_0 = q_0 = \frac{1}{2}$ as before but $\sigma_p^2 = 1/12 = 0.0833$ (Chap. VIII. § 12, p. 143). Hence $s^2 = 0.1667/n$, $s = 0.408/\sqrt{n}$, instead of $0.5/\sqrt{n}$, the value of $s$ if the chances are $\frac{1}{2}$ in every case. In most practical cases, however, the effect will be much less. Thus the standard-deviation of sampling for a death-rate of, say, 18 per thousand in a population of uniform age and one sex is $(18 \times 982)^{1/2}/\sqrt{n} = 133/\sqrt{n}$. In a population of the age composition of that of England and Wales, however, the death-rate is not, of course, uniform, but varies from a high value in infancy (say 150 per thousand), through very low values (2 to 4 per thousand) in childhood to continuously increasing values in old age; the standard-deviation of the rate within such a population is roughly about 30 per thousand. But the effect of this variation on the standard-deviation of simple sampling is quite small, for, as calculated from equation (4),

$$s^2 = \frac{1}{n}(18 \times 982 - 900),$$

so that $s$ is approximately $129.5/\sqrt{n}$, as compared with $133/\sqrt{n}$.
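The three illustrations of § 12 all follow from equation (4). A minimal sketch in Python (not part of the original text; the function name is an illustrative assumption) evaluates $n s^2 = p_0 q_0 - \sigma_p^2$ for each of them, with $p$ expressed on a scale from 0 to 1.

```python
from math import sqrt

def n_s_squared(p0, sigma_p):
    """n * s^2 from equation (4): p0 * q0 - sigma_p^2, with p on a 0-1 scale."""
    return p0 * (1 - p0) - sigma_p**2

# Limiting case: p is 0 for half the events and 1 for the remainder.
limiting = n_s_squared(0.5, 0.5)

# p uniformly distributed between 0 and 1, so that sigma_p^2 = 1/12.
uniform = sqrt(n_s_squared(0.5, sqrt(1 / 12)))

# Death-rate of 18 per thousand with sigma_p of 30 per thousand;
# the results are multiplied by 1000 to express them per thousand.
death_rate = 1000 * sqrt(n_s_squared(0.018, 0.030))
simple = 1000 * sqrt(n_s_squared(0.018, 0.0))

print(limiting, round(uniform, 3), round(death_rate, 1), round(simple, 1))
```

Each printed value is the numerator of $s$, to be divided by $\sqrt{n}$: zero in the limiting case, about 0.408 for the uniform case, and about 129.5 against 133 for the death-rate.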

13. We have finally to pass to the third condition (c) of § 8, Chap. XIII., and to discuss the effect of a certain amount of dependence between the several “events” in each sample. We shall suppose, however, that the two other conditions (a) and (b) are fulfilled, the chances $p$ and $q$ being the same for every event at every trial, and constant throughout the experiment. The problem is again most simply treated on the lines of § 5 of the last chapter. The standard-deviation for each event is $(pq)^{1/2}$ as before, but the events are no longer independent: instead, therefore, of the simple expression

$$\sigma^2 = npq$$

we must have (cf. Chap. XI. § 2)

$$\sigma^2 = npq + 2pq\,(r_{12} + r_{13} + \ldots + r_{23} + \ldots),$$

where $r_{12}$, $r_{13}$, etc. are the correlations between the results of the first and second, first and third events, and so on: correlations for variables (number of successes) which can only take the values 0 and 1, but may nevertheless, of course, be treated as ordinary variables (cf. Chap. XI. § 10). There are $n(n-1)/2$ correlation-coefficients, and if, therefore, $r$ is the arithmetic mean of the correlations we may write

$$\sigma^2 = npq\,[1 + r(n-1)]. \qquad (5)$$

The standard-deviation of simple sampling will therefore be increased or diminished according as the average correlation between the results of the single events is positive or negative, and the effect may be considerable, as $\sigma$ may be reduced to zero or increased to $n(pq)^{1/2}$. For the standard-deviation of the proportion of successes in each sample we have the equation

$$s^2 = \frac{pq}{n}\,[1 + r(n-1)]. \qquad (6)$$

It should be noted that, as the means and standard-deviations for our variables are all identical, $r$ is the correlation-coefficient for a table formed by taking all possible pairs of results in the $n$ events of each sample.





It should also be noted that the case when $r$ is positive covers the departure from the rules of simple sampling discussed in §§ 9-10: for if we draw successive samples from different records, this introduces the positive correlation at once, even although the results of the events at each trial are quite independent of one another. Similarly, the case discussed in §§ 11-12 is covered by the case when $r$ is negative: for if the chances are not the same for every event at each trial, and the chance of success for some one event is above the average, the mean chance of success for the remainder must be below it. The cases (a), (b) and (c) are, however, best kept distinct, since a positive or negative correlation may arise for reasons quite different from those discussed in §§ 9-12.

14. As a simple illustration, consider the important case of sampling from a limited universe, e.g. of drawing $n$ balls in succession from the whole number $w$ in a bag containing $pw$ white balls and $qw$ black balls. On repeating such drawings a large number of times, we are evidently equally likely to get a white ball or a black ball for the first, second, or $n$th ball of the sample: the correlation-table formed from all possible pairs of every sample will therefore tend in the long run to give just the same form of distribution as the correlation-table formed from all possible pairs of the $w$ balls in the bag. But from Chap. XI. § 11 we know that the correlation-coefficient for this table is $-1/(w-1)$, whence

$$\sigma^2 = npq\left[1 - \frac{n-1}{w-1}\right] = npq\,\frac{w-n}{w-1}.$$

If $n = 1$, we have the obviously correct result that $\sigma = (pq)^{1/2}$, as in drawing from unlimited material: if, on the other hand, $n = w$, $\sigma$ becomes zero as it should, and the formula is thus checked for simple cases. For drawing 2 balls out of 4, $\sigma$ becomes $0.816\,(npq)^{1/2}$; for drawing 5 balls out of 10, $0.745\,(npq)^{1/2}$: in the case of drawing half the balls out of a very large number, it approximates to $(0.5\,npq)^{1/2}$, or $0.707\,(npq)^{1/2}$.
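The formula for a limited universe can be verified by direct enumeration. The sketch below, in Python (not part of the original text; the function names and the bag of 10 balls with 4 white are illustrative assumptions), computes the exact variance of the number of white balls in a sample drawn without replacement and compares it with $npq\,(w-n)/(w-1)$.

```python
from fractions import Fraction
from math import comb, sqrt

def exact_variance(w, white, n):
    """Exact variance of the number of white balls when n balls are drawn
    without replacement from a bag of w balls, `white` of them white."""
    total = comb(w, n)
    pmf = [Fraction(comb(white, k) * comb(w - white, n - k), total) for k in range(n + 1)]
    mean = sum(k * pr for k, pr in enumerate(pmf))
    return sum((k - mean) ** 2 * pr for k, pr in enumerate(pmf))

def formula_variance(w, white, n):
    """sigma^2 = n p q (w - n) / (w - 1), with p = white / w."""
    p = Fraction(white, w)
    return n * p * (1 - p) * Fraction(w - n, w - 1)

print(exact_variance(10, 4, 5) == formula_variance(10, 4, 5))

# Reducing factors quoted in the text for 2 out of 4 and 5 out of 10.
for w, n in [(4, 2), (10, 5)]:
    print(w, n, round(sqrt((w - n) / (w - 1)), 3))
```

The factors printed for the two cases are 0.816 and 0.745, the multipliers of $(npq)^{1/2}$ given above.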

In the case of contagious or infectious diseases, or of certain forms of accident that are apt, if fatal at all, to result in wholesale deaths, $r$ is positive, and if $n$ be large (as it usually is in such cases) a very small value of $r$ may easily lead to a very great increase in the observed standard-deviation. It is difficult to give a really good example from actual statistics, as the conditions are hardly ever constant from one year to another, but the following will serve to illustrate the point. During the twenty years 1887-1906 there were 2107 deaths from explosions of firedamp or coal-dust in the coal-mines of the United Kingdom, or an average of 105 deaths per annum. From § 12 of Chap. XIII. it follows that this should be the square of the standard-deviation of simple sampling, or the standard-deviation itself approximately 10.3. But the square of the actual standard-deviation is 7178, or its value 84.7, the numbers of deaths ranging between 14 (in 1903) and 317 (in 1894). This large standard-deviation, to judge from the figures, is partly, though not wholly, due to a general tendency to decrease in the numbers of deaths from explosions in spite of a large increase in the number of persons employed; but even if we ignore this, the magnitude of the standard-deviation can be accounted for by a very small value of the correlation $r$, expressive of the fact that if an explosion is sufficiently serious to be fatal to one individual, it will probably be fatal to others also. For if $\sigma_0$ denote the standard-deviation of simple sampling, $\sigma$ the standard-deviation of sampling given by equation (5), we have

$$r = \frac{\sigma^2 - \sigma_0^2}{(n-1)\,\sigma_0^2}.$$

Whence, from the above data, taking the numbers of persons employed underground at a rough average of 560,000,

$$r = \frac{7073}{560000 \times 105} = +0.00012.$$
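The estimate of $r$ follows from solving equation (5) for $r$. A minimal sketch in Python (not part of the original text; the function name is an illustrative assumption) repeats the calculation with the figures quoted above.

```python
def mean_correlation(var_observed, var_simple, n):
    """Solve sigma^2 = sigma_0^2 * [1 + r (n - 1)] for the mean correlation r."""
    return (var_observed - var_simple) / ((n - 1) * var_simple)

# Firedamp data: observed variance 7178, simple-sampling variance 105,
# roughly 560,000 persons employed underground.
r = mean_correlation(7178, 105, 560_000)
print(f"{r:.5f}")   # 0.00012
```

The sketch divides by $n - 1$ where the text uses $n$; with $n$ of this size the difference does not affect the figures shown.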


15. Summarising the preceding paragraphs, §§ 9-14, we see that if the chances $p$ and $q$ differ for the various universes, districts, years, materials, or whatever they may be from which the samples are drawn, the standard-deviation observed will be greater than the standard-deviation of simple sampling, as calculated from the average values of the chances: if the average chances are the same for each universe from which a sample is drawn, but vary from individual to individual or from one sub-class to another within the universe, the standard-deviation observed will be less than the standard-deviation of simple sampling as calculated from the mean values of the chances: finally, if $p$ and $q$ are constant, but the events are no longer independent, the observed standard-deviation will be greater or less than the simplest theoretical value according as the correlation between the results of the single events is positive or negative. These conclusions further emphasise the need for caution in the use of standard errors. If we find that the standard-deviation in some case of sampling exceeds the standard-deviation of simple sampling, two interpretations are possible: either that $p$ and $q$ are different in the various universes from which samples have been drawn (i.e. that the variations are more or less definitely significant in the sense of § 13, Chap. XIII.), or that the results of the events are positively correlated inter se. If the actual standard-deviation fall short of the standard-deviation of simple sampling two interpretations are again possible, either that the chances $p$ and $q$ vary for different individuals or sub-classes in each universe, while approximately constant from one universe to another, or that the results of the events are negatively correlated inter se. Even if the actual standard-deviation approaches closely to the standard-deviation of simple sampling, it is only a conjectural and not a necessary inference that all the “conditions of simple sampling” as defined in § 8 of the last chapter are fulfilled. Possibly, for example, there may be a positive correlation $r$ between the results of the different events, masked by a variation of the chances $p$ and $q$ in sub-classes of each universe.

Sampling which fulfils the conditions laid down in § 8 of Chap. XIII., simple sampling as we have called it, is generally spoken of as random sampling. We have thought it better to avoid this term, as the condition that the sampling shall be random (haphazard) is not the only condition tacitly assumed.


REFERENCES.

Cf. generally the references to Chap. XIII., to which may be added:

(1) Pearson, Karl, “On certain Properties of the Hypergeometrical Series, and on the fitting of such Series to Observation Polygons in the Theory of Chance,” Philosophical Magazine, 5th Series, vol. xlvii., 1899, p. 236. (An expansion of one section of ref. 10 of Chap. XIII., dealing with the first problem of our § 14, i.e. drawing samples from a bag containing a limited number of white and black balls, from the standpoint of the frequency-distribution of the number of white or black balls in the samples.)

(2) Greenwood, M., “On Errors of Random Sampling in certain Cases not suitable for the Application of a ‘Normal Curve of Frequency,’” Biometrika, vol. ix., 1913, pp. 69-90. (If an event has succeeded p times in n trials, what are the chances of 0, 1, ..., m successes in m subsequent trials? Tables for small samples.)

EXERCISES.

1. Referring to Question 7 of Chap. XIII., work out the values of the significant standard-deviation $\sigma_p$ (as in § 10) for each row or group of rows there given, but taking row 5 with rows 6 and 7.

2. For all the districts in England and Wales included in the same table (Table VI., Chap. IX.) the standard-deviation of the proportion of male births per 1000 of all births is 7.46 and the mean proportion of male births 509.2. The harmonic mean number of births in a district is 5070. Find the significant standard-deviation $\sigma_p$.

3. If for one half of $n$ events the chance of success is $p$ and the chance of failure $q$, whilst for the other half the chance of success is $q$ and the chance of failure $p$, what is the standard-deviation of the number of successes, the events being all independent?

4. The following are the deaths from small-pox during the 20 years 1882-1901 in England and Wales:

Year    Deaths      Year    Deaths
1882     1317       1892      431
1883      957       1893     1457
1884     2234       1894      820
1885     2827       1895      223
1886      276       1896      541
1887      506       1897       25
1888     1026       1898      253
1889       23       1899      174
1890       16       1900       85
1891       49       1901      356

The death-rate from small-pox being very small, the rule of § 12, Chap. XIII., may be applied to estimate the standard-deviation of simple sampling. Assuming that the excess of the actual standard-deviation over this can be entirely accounted for by a correlation between the results of exposure to risk of the individuals composing the population, estimate $r$. The mean population during the period may be taken in round numbers as 29 millions.



CHAPTER XV.

THE BINOMIAL DISTRIBUTION AND THE NORMAL CURVE.

1-2. Determination of the frequency-distribution for the number of successes in n events: the binomial distribution. 3. Dependence of the form of the distribution on p, q and n. 4-5. Graphical and mechanical methods of forming representations of the binomial distribution. 6. Direct calculation of the mean and the standard-deviation from the distribution. 7-8. Necessity of deducing, for use in many practical cases, a continuous curve giving approximately, for large values of n, the terms of the binomial series. 9. Deduction of the normal curve as a limit to the symmetrical binomial. 10-11. The value of the central ordinate. 12. Comparison with a binomial distribution for a moderate value of n. 13. Outline of the more general conditions from which the curve can be deduced by advanced methods. 14. Fitting the curve to an actual series of observations. 15. Difficulty of a complete test of fit by elementary methods. 16. The table of areas of the normal curve and its use. 17. The quartile deviation and the “probable error.” 18. Illustrations of the application of the normal curve and of the table of areas.

1. In Chapters XIII. and XIV. the standard-deviation of the number of successes in $n$ events was determined for the several more important cases, and the applications of the results indicated. For the simpler cases of artificial chance it is possible, however, to go much further, and determine not merely the standard-deviation but the entire frequency-distribution of the number of “successes.” This we propose to do for the case of “simple sampling,” in which all the events are completely independent, and the chances $p$ and $q$ the same for each event and constant throughout the trials. The case corresponds to the tossing of ideally perfect coins (homogeneous circular discs), or the throwing of ideally perfect dice (homogeneous cubes).

2. If we deal with one event only, we expect in $N$ trials, $Nq$ failures and $Np$ successes. Suppose we now combine with the results of this first event the results of a second. The two events are quite independent, and therefore, according to the rule of independence, of the $Nq$ failures of the first event $(Nq)q$ will be associated (on an average) with failures of the second event, and $(Nq)p$ with successes of the second event (cf. row 2 of the scheme on p. 292). Similarly of the $Np$ successful first events, $(Np)q$ will be associated (on an average) with failures of the second event and $(Np)p$ with successes. In $N$ trials of two events we would therefore expect approximately $Nq^2$ cases of no success, $2Npq$ cases of one success and one failure, and $Np^2$ cases of two successes, as in row 3 of the scheme. The results of a third event may be combined with those of the first two in precisely the same way. Of the $Nq^2$ cases in which both the first two events failed, $(Nq^2)q$ will be associated (on an average) with failure of the third also, $(Nq^2)p$ with success of the third. Of the $2Npq$ cases of one success and one failure, $(2Npq)q$ will be associated with failure of the third event and $(2Npq)p$ with success, and similarly for the cases in which both the first two events succeeded. The result is that in $N$ trials of three events we should expect $Nq^3$ cases of no success, $3Npq^2$ cases of one success, $3Np^2q$ cases of two successes, and $Np^3$ cases of three successes, as in row 5 of the scheme. The scheme is continued for the results of a fourth event, and it is evident that all the results are included under a very simple rule: the frequencies of 0, 1, 2 ... successes are given

for one event by the binomial expansion of $N(q+p)$,
for two events by the binomial expansion of $N(q+p)^2$,
for three events by the binomial expansion of $N(q+p)^3$,
for four events by the binomial expansion of $N(q+p)^4$,

and so on. Quite generally, in fact: the frequencies of 0, 1, 2 ... successes in $N$ trials of $n$ events are given by the successive terms in the binomial expansion of $N(q+p)^n$, viz.

$$N\left\{q^n + n\,q^{n-1}p + \frac{n(n-1)}{1\cdot 2}\,q^{n-2}p^2 + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}\,q^{n-3}p^3 + \ldots\right\}.$$

This is the first theoretical expression that we have obtained for the form of a frequency-distribution.
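The step-by-step combination of events described in § 2 can be written out as a short routine. This is a minimal sketch in Python (not part of the original text; the function names are illustrative assumptions): each new event sends a fraction $q$ of every existing class to the same number of successes and a fraction $p$ to one success more.

```python
from fractions import Fraction
from math import comb

def add_event(freqs, p):
    """Combine one further independent event with existing frequencies.

    freqs[k] is the expected number of trials showing k successes so far.
    """
    q = 1 - p
    new = [Fraction(0)] * (len(freqs) + 1)
    for successes, f in enumerate(freqs):
        new[successes] += f * q        # the new event fails
        new[successes + 1] += f * p    # the new event succeeds
    return new

def binomial_frequencies(N, n, p):
    """Expected frequencies of 0, 1, ..., n successes in N trials of n events."""
    freqs = [Fraction(N)]
    for _ in range(n):
        freqs = add_event(freqs, p)
    return freqs

# Three events with p = q = 1/2 in N = 8 trials: frequencies 1, 3, 3, 1.
print([int(f) for f in binomial_frequencies(8, 3, Fraction(1, 2))])
```

The result agrees term by term with the expansion of $N(q+p)^n$, which the check below confirms for $n = 4$ and $p = 1/3$ using the binomial coefficients directly.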

3. The general form of the distributions given by such binomial series will have been evident from the experimental examples given in Chapter XIII., i.e. they are distributions of greater or less asymmetry, tailing off in either direction from the mode. The distribution is, however, of so much importance that it is worth while considering the form in greater detail. This form evidently depends (1) on the values of $q$ and $p$, (2) on the value of the exponent $n$. If $p$ and $q$ are equal, evidently the distribution must be symmetrical, for $p$ and $q$ may be interchanged without altering the value of any term, and consequently terms equidistant from either end of the series are equal. If $p$ and $q$ are unequal, on the other hand, the distribution is asymmetrical, and the more asymmetrical, for the same value of $n$, the greater the inequality of the chances. The following table shows the calculated distributions for $n = 20$ and values of $p$, proceeding by 0.1, from 0.1 to 0.5.

Terms of the Binomial Series $10{,}000\,(q + p)^{20}$ for Values of p from 0.1 to 0.5. (Figures given to the nearest unit.)

Number of     p = 0.1    p = 0.2    p = 0.3    p = 0.4    p = 0.5
Successes     q = 0.9    q = 0.8    q = 0.7    q = 0.6    q = 0.5

    0           1216        115          8          -          -
    1           2702        576         68          5          -
    2           2852       1369        278         31          2
    3           1901       2054        716        123         11
    4            898       2182       1304        350         46
    5            319       1746       1789        746        148
    6             89       1091       1916       1244        370
    7             20        545       1643       1659        739
    8              4        222       1144       1797       1201
    9              1         74        654       1597       1602
   10              -         20        308       1171       1762
   11              -          5        120        710       1602
   12              -          1         39        355       1201
   13              -          -         10        146        739
   14              -          -          2         49        370
   15              -          -          -         13        148
   16              -          -          -          3         46
   17              -          -          -          -         11
   18              -          -          -          -          2
   19              -          -          -          -          -
   20              -          -          -          -          -

When $p = 0.1$, cases of two successes are the most frequent, but cases of one success almost equally frequent: even nine successes may, however, occur about once in 10,000 trials. As $p$ is increased, the position of the maximum frequency gradually advances, and the two tails of the distribution become more nearly equal, until $p = 0.5$, when the distribution is symmetrical. Of course, if the table were continued, the distribution for $p = 0.6$ would be similar to that for $q = 0.6$, but reversed end for end, and so on. Since the standard-deviation is $(npq)^{1/2}$ and the maximum value of $pq$ is given by $p = q$, the symmetrical distribution has the greatest dispersion.
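A table of this kind is easily recomputed. The following is a minimal sketch in Python (not part of the original text; the function name is an illustrative assumption), which evaluates the terms of $10{,}000\,(q+p)^{20}$ exactly and rounds them to the nearest unit.

```python
from fractions import Fraction
from math import comb

def binomial_terms(N, n, p):
    """Terms of N (q + p)^n for 0, 1, ..., n successes, to the nearest unit."""
    q = 1 - p
    return [round(N * comb(n, k) * p**k * q ** (n - k)) for k in range(n + 1)]

# One column of the table for each value of p from 0.1 to 0.5.
for tenths in range(1, 6):
    p = Fraction(tenths, 10)
    print(float(p), binomial_terms(10_000, 20, p))
```

Exact fractions are used for $p$ so that the rounding to the nearest unit is not disturbed by floating-point error; terms that round to zero correspond to the blank entries of the printed table.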






If $p = q$ the effect of increasing $n$ is to raise the mean and increase the dispersion. If $p$ is not equal to $q$, however, not only does an increase in $n$ raise the mean and increase the dispersion, but it also lessens the asymmetry; the greater $n$, for the same value of $p$ and $q$, the less the asymmetry. Thus if we compare the first distribution of the above table with that given by $n = 100$, we have the following:

B. Terms of the Binomial Series $10{,}000\,(0.9 + 0.1)^{100}$. (Figures given to the nearest unit.)

Number of               Number of               Number of
Successes  Frequency    Successes  Frequency    Successes  Frequency

    0          -            8        1148          16         193
    1          3            9        1304          17         106
    2         16           10        1319          18          54
    3         59           11        1199          19          26
    4        159           12         988          20          12
    5        339           13         743          21           5
    6        596           14         513          22           2
    7        889           15         327          23           1

The maximum frequencies now occur for 9 and 10 successes, and the two “tails” are much more nearly equal. If, on the other hand, $n$ is reduced to 2, the distribution is:

Number of Successes    Frequency
        0                8100
        1                1800
        2                 100

and the maximum frequency is at one end of the range. Whatever the values of $p$ and $q$, if $n$ is only increased sufficiently, the distribution may be treated as sensibly symmetrical, the necessary condition being (we state this without proof) that $p - q$ shall be small compared with the standard-deviation $\sqrt{npq}$. It is left to the student to calculate as an exercise the theoretical distributions corresponding to the experimental results cited in Chapter XIII. (Question 1).

4. The property of the binomial series used in the scheme of § 2 for deducing the series with exponent $n$ from that with exponent $n - 1$ leads to two interesting methods, graphical and mechanical, for constructing approximate representations of binomial distributions. It will have been noted that any one term, say the $r$th, in one series is obtained by taking $q$ times the $r$th term together with $p$ times the $(r-1)$th term of the preceding series. Now if $AP$, $CR$ (figure 46) be two verticals, and a third, $BQ$, be erected between them, cutting $PR$ in $Q$, so that $AB : BC :: q : p$, then

$$BQ = p \cdot AP + q \cdot CR.$$

(This follows at once on joining $AR$ and considering the two segments into which $BQ$ is divided.) Consider then some binomial, say for the case $p = \frac{1}{4}$, $q = \frac{3}{4}$. Draw a series of verticals (the heavy verticals of fig. 47) at any convenient distance apart on a horizontal base line, and erect other verticals (the lighter verticals) dividing the distance between them in the ratio of $q : p$, viz. 3 : 1. Next, choosing a vertical scale, draw the binomial polygon for the simplest case $n = 1$; in the diagram $N$ has been taken $= 4096$, and the polygon is $abcd$, $0b = 3072$, $1c = 1024$. The polygons for higher values of $n$ may now be constructed graphically. Mark the points where $ab$, $bc$, $cd$ respectively cut the intermediate verticals and project them horizontally to the right on to the thick verticals. This gives the polygon $ab'c'd'e$ for $n = 2$. For $0b' = q \cdot 0b$, $1c' = p \cdot 0b + q \cdot 1c$, and so on. Similarly, if the points where $ab'$, $b'c'$, etc., cut the intermediate verticals are projected horizontally on to the thick verticals, we have the polygon $ab''c''d''e''f''$ for $n = 3$. The process may be continued indefinitely, though it will be found difficult to maintain any high degree of accuracy after the first few constructions.

[Fig. 46.]
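The graphical construction amounts to repeated linear interpolation between neighbouring ordinates. A minimal sketch in Python (not part of the original text; the function name is an illustrative assumption) carries it out numerically for the case of the diagram, $N = 4096$, $p = \frac{1}{4}$, $q = \frac{3}{4}$.

```python
from fractions import Fraction

def next_polygon(ordinates, p):
    """Ordinates of the polygon for n + 1 from those for n.

    Each new ordinate is the height at which the line joining two adjacent
    old ordinates cuts the intermediate vertical, i.e. p times the left-hand
    ordinate plus q times the right-hand one; the polygon is closed by zero
    ordinates at both ends.
    """
    q = 1 - p
    padded = [0] + list(ordinates) + [0]
    return [p * padded[i] + q * padded[i + 1] for i in range(len(padded) - 1)]

p = Fraction(1, 4)
polygon = [3072, 1024]                 # n = 1: the ordinates 0b and 1c
for n in (2, 3):
    polygon = next_polygon(polygon, p)
    print(n, [int(y) for y in polygon])
```

For $n = 2$ this gives 2304, 1536, 256, and for $n = 3$ it gives 1728, 1728, 576, 64, which are the terms of $4096\,(\frac{3}{4} + \frac{1}{4})^n$.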



Fig. 47. Graphical Construction of Binomial Polygons for successive values of n: N = 4096, p = 1/4, q = 3/4.

5. The mechanical method of constructing the representation of a binomial series is indicated diagrammatically by fig. 48. The


apparatus consists of a funnel opening into a space — say a J inch in 
depth— between a sheet of glass and a back-board. This space is 
broken up by successive rows of wedges like 1, 2 3, 4 5 6, etc., which will divide up into streams any granular material such as shot or mustard seed which is poured through the funnel when the apparatus is held at a slope. At the foot these wedges are replaced by vertical strips, in the spaces between which the material can collect. Consider the stream of material that comes from the funnel and meets the wedge 1. This wedge is set so as to throw $q$ parts of the stream to the left and $p$ parts to the right (of the observer). The wedges 2 and 3 are set so as to divide the resultant streams in the same proportions. Thus wedge 2 throws $q^2$ parts of the original material to the left and $qp$ to the right, wedge 3 throws $pq$ parts of the original material to the left and $p^2$ to the right. The streams passing these wedges are therefore in the ratio of $q^2 : 2qp : p^2$. The next row of wedges is again set so as to divide these streams in the same proportions as before, and the four streams that result will bear the proportions $q^3 : 3q^2p : 3qp^2 : p^3$. The final set, at the heads of the vertical strips, will give the streams proportions $q^4 : 4q^3p : 6q^2p^2 : 4qp^3 : p^4$, and these streams will accumulate between the strips and give a representation of the binomial by a kind of histogram, as shown. Of course as many rows of wedges may be provided as may be desired.

Fig. 48. The Pearson-Galton Binomial Apparatus.

This kind of apparatus was originally devised by Sir Francis Galton (ref. 1) in a form that gives roughly the symmetrical binomial, a stream of shot being allowed to fall through rows of nails, and the resultant streams being collected in partitioned spaces. The apparatus was generalised by Professor Pearson, who used rows of wedges fixed to movable slides, so that they could be adjusted to give any ratio of $q : p$. (Ref. 13.)

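6. The values of the mean and standard-deviation of a binomial distribution may be found from the terms of the series directly, as well as by the method of Chap. XIII. (the calculation was in fact given as an exercise in Question 8, Chap. VII., and Question 6, Chap. VIII.). Arrange the terms under each other as in col. 1 below, and treat the problem as if it were an arithmetical example, taking the arbitrary origin at 0 successes: as $N$ is a factor all through, it may be omitted for convenience.

$$\begin{array}{llll}
(1) & (2) & (3) & (4) \\
\text{Frequency } f & \text{Dev. } \xi & f\xi & f\xi^2 \\[4pt]
q^n & 0 & - & - \\[2pt]
n\,q^{n-1}p & 1 & n\,q^{n-1}p & n\,q^{n-1}p \\[2pt]
\dfrac{n(n-1)}{1\cdot 2}\,q^{n-2}p^2 & 2 & n(n-1)\,q^{n-2}p^2 & 2n(n-1)\,q^{n-2}p^2 \\[6pt]
\dfrac{n(n-1)(n-2)}{1\cdot 2\cdot 3}\,q^{n-3}p^3 & 3 & \dfrac{n(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^3 & \dfrac{3n(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^3 \\[4pt]
\ldots & \ldots & \ldots & \ldots
\end{array}$$

The sum of col. 1 is of course unity, i.e. we are treating $N$ as unity, and the mean is therefore given by the sum of the terms in col. (3). But this sum is

$$np\left\{q^{n-1} + (n-1)\,q^{n-2}p + \frac{(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^2 + \ldots\right\} = np\,(q+p)^{n-1} = np.$$

That is, the mean $M$ is $np$, as by the method of Chap. XIII.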





The square of the standard-deviation is given by the sum of the terms in col. (4) less the square of the mean, that is,

$$\sigma^2 = np\left\{q^{n-1} + 2(n-1)\,q^{n-2}p + 3\,\frac{(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^2 + \ldots\right\} - n^2p^2.$$

But the series in the bracket is the binomial series $(q+p)^{n-1}$ with the successive terms multiplied by 1, 2, 3, .... It therefore gives the difference of the mean of the said binomial from $-1$, and its sum is therefore $(n-1)p + 1$. Therefore

$$\sigma^2 = np\,\{(n-1)p + 1\} - n^2p^2 = np - np^2 = npq.$$
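The direct calculation of § 6 can be repeated numerically for any particular $n$ and $p$. A minimal sketch in Python (not part of the original text; the function name and the example values are illustrative assumptions) forms the columns $f$, $f\xi$ and $f\xi^2$ and checks the results against $np$ and $npq$.

```python
from fractions import Fraction
from math import comb

def mean_and_variance(n, p):
    """Mean and squared standard-deviation computed from the binomial terms,
    taking the origin at 0 successes and treating N as unity."""
    q = 1 - p
    f = [comb(n, k) * q ** (n - k) * p**k for k in range(n + 1)]   # col. (1)
    first = sum(k * fk for k, fk in enumerate(f))                  # sum of col. (3)
    second = sum(k * k * fk for k, fk in enumerate(f))             # sum of col. (4)
    return first, second - first**2

n, p = 20, Fraction(3, 10)
mean, variance = mean_and_variance(n, p)
print(mean, variance, n * p, n * p * (1 - p))
```

With $n = 20$ and $p = 0.3$ the mean is 6 and the squared standard-deviation 21/5, equal to $np$ and $npq$ respectively.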

7. The terms of the binomial series thus afford a means of completely describing a certain class of frequency-distributions, i.e. of giving not merely the mean and standard-deviation in each case, but of describing the whole form of the distribution. If $N$ samples of $n$ cards each be drawn from an indefinitely large record of cards marked with A or a, the proportion of A-cards in the record being $p$, then the successive terms of the series $N(q + p)^n$ give the frequencies to be expected in the long run of 0, 1, 2, ... A-cards in the sample, the actual frequencies only deviating from these by errors which are themselves fluctuations of sampling. The three constants $N$, $p$, $n$, therefore, determine the average or smoothed form of the distribution to which actual distributions will more or less closely approximate.

Considered, however, as a formula which may be generally useful for describing frequency-distributions, the binomial series suffers from a serious limitation, viz. that it only applies to a strictly discontinuous distribution like that of the number of A-cards drawn from a record containing A's and a's, or the number of heads thrown in tossing a coin. The question arises whether we can pass from this discontinuous formula to an equation suitable for representing a continuous distribution of frequency.

8. Such an equation becomes, indeed, almost a necessity for
certain cases with which we have already dealt. Consider, for
example, the frequency-distribution of the number of male births
in batches of 10,000 births, the mean number being, say, 5100.
The distribution will be given by the terms of the series
$(0.49 + 0.51)^{10,000}$ and the standard-deviation is, in round numbers,
50 births. The distribution will therefore extend to some 150
births or more on either side of the mean number, and in order
to obtain it we should have to calculate some 300 terms of a
binomial series with an exponent of 10,000. This would not
only be practically impossible without the use of certain methods
of approximation, but it would give the distribution in quite



XV. BINOMIAL DISTRIBUTION AND NORMAL CURVE.


unnecessary detail: as a matter of practice, we would not have
compiled a frequency-distribution by single male births, but
would certainly have grouped our observations, taking probably
10 births as the class-interval. We want, therefore, to replace the
binomial series by some continuous curve, having approximately
the same ordinates, the curve being such that the area between
any two ordinates $y_{1}$ and $y_{2}$ will give the frequency of observations
between the corresponding values of the variable $x_{1}$ and $x_{2}$.
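The labour spoken of above is that of hand computation; with the
logarithms of the factorials to hand, single terms of the series
$(0.49 + 0.51)^{10,000}$ can be evaluated and set beside the ordinates of
a continuous curve of the same mean and standard-deviation. The sketch
below is an illustration only: it assumes the log-gamma function of
Python's standard library as the source of the logarithms, and the
names binomial_term and normal_ordinate are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt

def binomial_term(n, m, p):
    """Term of (q + p)^n for m successes, computed through logarithms."""
    q = 1.0 - p
    log_term = (lgamma(n + 1) - lgamma(m + 1) - lgamma(n - m + 1)
                + m * log(p) + (n - m) * log(q))
    return exp(log_term)

n, p = 10_000, 0.51
mean = n * p                         # 5100 male births
sigma = sqrt(n * p * (1.0 - p))      # very nearly 50 births

def normal_ordinate(m):
    """Ordinate of the normal curve of unit area with the same mean and sigma."""
    return exp(-((m - mean) ** 2) / (2.0 * sigma ** 2)) / (sigma * sqrt(2.0 * pi))

for m in (4950, 5000, 5050, 5100, 5150, 5200, 5250):
    print(m, binomial_term(n, m, p), normal_ordinate(m))
```

The printed pairs show how closely, for so large an exponent, a smooth
curve can stand in for the several hundred separate terms.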

9. It is possible to find such a continuous limit to the binomial
series for any values of p and q, but in the present work we will
confine ourselves to the simplest case in which p = q = 0.5, and the
binomial is symmetrical. The terms of the series are

$$N\left(\tfrac{1}{2}\right)^{n}\left\{1 + n + \frac{n(n-1)}{1\cdot 2} + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3} + \dots\right\}.$$

The frequency of m successes is

$$N\left(\tfrac{1}{2}\right)^{n}\frac{n!}{m!\,(n-m)!}$$

and the frequency of m + 1 successes is derived from this by
multiplying it by (n - m)/(m + 1). The latter frequency is
therefore greater than the former so long as

$$n - m > m + 1 \quad\text{or}\quad m < \frac{n-1}{2}.$$

Suppose, for simplicity, that n is even, say equal to 2k; then the
frequency of k successes is the greatest, and its value is

$$y_{0} = N\left(\tfrac{1}{2}\right)^{2k}\frac{(2k)!}{k!\,k!} \qquad (1)$$

The polygon tails off symmetrically on either side of this greatest
ordinate. Consider the frequency of k + x successes; the value is

$$y_{x} = N\left(\tfrac{1}{2}\right)^{2k}\frac{(2k)!}{(k+x)!\,(k-x)!} \qquad (2)$$

and therefore

$$\frac{y_{x}}{y_{0}} = \frac{k!\,k!}{(k+x)!\,(k-x)!} = \frac{k(k-1)(k-2)\dots(k-x+1)}{(k+1)(k+2)(k+3)\dots(k+x)} = \frac{\left(1-\frac{1}{k}\right)\left(1-\frac{2}{k}\right)\dots\left(1-\frac{x-1}{k}\right)}{\left(1+\frac{1}{k}\right)\left(1+\frac{2}{k}\right)\dots\left(1+\frac{x}{k}\right)} \qquad (3)$$

Now let us approximate by assuming, as suggested in § 8, that
k is very large, and indeed large compared with x, so that $(x/k)^{2}$
may be neglected compared with $(x/k)$. This assumption does
not involve any difficulty, for we need not consider values of x
much greater than three times the standard-deviation or $3\sqrt{k/2}$,
and the ratio of this to k is $3/\sqrt{2k}$, which is necessarily small if k
be large. On this assumption we may apply the logarithmic
series

$$\log_{e}(1+\delta) = \delta - \frac{\delta^{2}}{2} + \frac{\delta^{3}}{3} - \frac{\delta^{4}}{4} + \dots$$

to every bracket in the fraction (3), and neglect all terms beyond
the first. To this degree of approximation,

$$\log_{e}\frac{y_{x}}{y_{0}} = -\frac{2}{k}\{1 + 2 + 3 + \dots + (x-1)\} - \frac{x}{k} = -\frac{x(x-1)}{k} - \frac{x}{k} = -\frac{x^{2}}{k}.$$

Therefore, finally,

$$y_{x} = y_{0}\,e^{-x^{2}/k} = y_{0}\,e^{-x^{2}/2\sigma^{2}} \qquad (4)$$

where, in the last expression, the constant k has been replaced by
the standard-deviation σ, for $\sigma^{2} = k/2$.

The curve represented by this equation is symmetrical about
the point x = 0, which gives the greatest ordinate. Mean,
median, and mode therefore coincide, and the curve is, in fact, that
drawn in fig. 5, p. 89, and taken as the ideal form of the symmetrical
frequency-distribution in Chap. VI. The curve is generally
known as the normal curve of errors or of frequency, or the law
of error.
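The closeness of the limiting form may be tried numerically by setting
the exact ratio of the term for k + x successes to the central term
beside the exponential with exponent -x²/k. The following is a sketch
under stated assumptions only: the value k = 5000 and the function
names exact_ratio and normal_ratio are chosen for illustration.

```python
from math import comb, exp

def exact_ratio(k, x):
    """y_x / y_0 for the symmetrical binomial with exponent n = 2k."""
    return comb(2 * k, k + x) / comb(2 * k, k)

def normal_ratio(k, x):
    """Limiting form exp(-x^2 / k), i.e. exp(-x^2 / (2 sigma^2)) with sigma^2 = k / 2."""
    return exp(-x * x / k)

k = 5000                      # exponent n = 10,000, sigma = 50
for x in (0, 25, 50, 100, 150):
    print(x, exact_ratio(k, x), normal_ratio(k, x))
```

For deviations up to three times the standard-deviation the two columns
differ only in the later decimal places, as the neglect of the higher
terms of the logarithmic series would lead one to expect.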

10. A normal curve is evidently defined completely by giving
the values of $y_{0}$ and σ and assigning the origin of x. If we
desire to make a normal curve fit some given distribution as near
as may be, the last two data are given by the standard-deviation
and the mean respectively; the value of $y_{0}$ will be given by the
fact that the areas of the two distributions, or the numbers of
observations which these areas represent, must be the same.

This condition does not, however, lead in any simple and
elementary algebraic way to an expression for $y_{0}$, though such
a value could be found arithmetically to any desired degree
of approximation. For it is evident that (1) any alteration in
$y_{0}$ produces a proportionate alteration in the area of the curve,
e.g. doubling $y_{0}$ doubles every ordinate and therefore doubles
the area; (2) any alteration in σ produces a proportionate
alteration in the area, for the values of $y_{x}$ are the same for the
same values of x/σ, and therefore doubling σ doubles the distance
of every ordinate from the mean, and consequently doubles the
area. The area of the curve, or the number of observations
represented, is therefore proportional to $y_{0}\sigma$, or we must have

$$N = a \times y_{0}\sigma,$$

where a is a numerical constant. The value of a may be found
approximately by taking $y_{0}$ and σ both equal to unity, calculating
the values of the ordinates $y_{x}$ for equidistant values of x, and
taking the area, or number of observations N, as given by the
sum of the ordinates multiplied by the interval.

11. The table below gives the values of $y_{x}$ for values of x
proceeding by fifths of a unit; the values are, of course, the same
for positive and negative values of x. For the whole curve the
sum of the ordinates will be found to be 12.53318, the interval
being 0.2 units; the area is therefore, approximately, 2.50664,

Ordinates of the Curve $y = e^{-x^{2}/2}$. (For references to more extended
tables, see list on pp. 357-8.) Negative characteristics of the
logarithms are written with a bar.

   x      y         Log y              x      y         Log y
  0.0   1.00000    0                  2.6   .03405    $\bar{2}$.53209
  0.2    .98020    $\bar{1}$.99131    2.8   .01984    $\bar{2}$.29757
  0.4    .92312    $\bar{1}$.96526    3.0   .01111    $\bar{2}$.04567
  0.6    .83527    $\bar{1}$.92183    3.2   .00598    $\bar{3}$.77641
  0.8    .72615    $\bar{1}$.86103    3.4   .00309    $\bar{3}$.48978
  1.0    .60653    $\bar{1}$.78285    3.6   .00153    $\bar{3}$.18577
  1.2    .48675    $\bar{1}$.68731    3.8   .00073    $\bar{4}$.86439
  1.4    .37531    $\bar{1}$.57439    4.0   .00034    $\bar{4}$.52564
  1.6    .27804    $\bar{1}$.44410    4.2   .00015    $\bar{4}$.16952
  1.8    .19790    $\bar{1}$.29644    4.4   .00006    $\bar{5}$.79603
  2.0    .13534    $\bar{1}$.13141    4.6   .00003    $\bar{5}$.40516
  2.2    .08892    $\bar{2}$.94901    4.8   .00001    $\bar{6}$.99693
  2.4    .05614    $\bar{2}$.74923    5.0   .00000    $\bar{6}$.57132


and this is the approximate value of a. The value is more than
sufficiently accurate for practical purposes, for the exact value
is $\sqrt{2\pi}$ = 2.506627 .... The proof of this value cannot be given
here, but it may be deduced from an important approximate
expression for the factorials of large numbers, due to James
Stirling (1730). If n be large, we have, to a high degree of
approximation,

$$n! = \sqrt{2\pi n}\;n^{n}e^{-n}.$$

Applying Stirling’s theorem to the factorials in equation (1) we
have

$$y_{0} = \frac{N}{\sqrt{\pi k}} = \frac{N}{\sqrt{2\pi}\,\sigma} \qquad (5)$$

The complete expression for the normal curve is therefore

$$y = \frac{N}{\sqrt{2\pi}\,\sigma}\,e^{-x^{2}/2\sigma^{2}} \qquad (6)$$
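Both steps of this paragraph admit of a short arithmetical trial: the
summation of ordinates at intervals of 0.2 which yields the constant a,
and the accuracy of Stirling's expression for a moderately large
factorial. The sketch below assumes ordinates computed afresh rather
than read from the five-figure table, so that its sum may differ from
12.53318 in the last places; the function name stirling and the trial
value n = 20 are illustrative assumptions.

```python
from math import exp, factorial, pi, sqrt

# Ordinates of y = exp(-x^2 / 2) at x = -5.0, -4.8, ..., +5.0 (interval 0.2).
step = 0.2
ordinates = [exp(-((i * step) ** 2) / 2.0) for i in range(-25, 26)]
total = sum(ordinates)
area = total * step            # approximate value of the constant a

def stirling(n):
    """Stirling's approximation sqrt(2 pi n) * n^n * e^(-n) to n!."""
    return sqrt(2.0 * pi * n) * n ** n * exp(-n)

print(total, area, sqrt(2.0 * pi))
print(stirling(20) / factorial(20))   # ratio a little below unity
```

The area found by summation agrees with √(2π) to several places, and
the ratio of Stirling's value to the true factorial falls short of
unity by less than one-half per cent. even for n = 20.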


The exponent may be written $x^{2}/c^{2}$, where $c = \sqrt{2}\,\sigma$, and this is
the origin of the use of $\sqrt{2} \times \sigma$ (the “modulus”) as a measure
of dispersion, of $1/\sqrt{2}\,\sigma$ as a measure of “precision,” and of $2\sigma^{2}$
as “the fluctuation” (cf. Chap. VIII. § 13). The use of the factor
2 or $\sqrt{2}$ becomes meaningless if the distribution be not normal.

Another rule cited in Chap. VIII., viz. that the mean deviation
is approximately 4/5 of the standard-deviation, is strictly true
for the normal curve only. For this distribution the mean
deviation = $\sqrt{2/\pi}\,\sigma$ = 0.79788 ... σ: the proof cannot be given
within the limitations of the present work. The rule that a
range of 6 times the standard-deviation includes the great
majority of the observations and that the quartile deviation is
about 2/3 of the standard-deviation were also suggested by the
properties of this curve (see below §§ 16, 17).

12. In the proof of § 9 the assumption was made that k (the
half of the exponent of the binomial) was very large compared
with x (any deviation that had to be considered). In point
of fact, however, the normal curve gives the terms of the
symmetrical binomial surprisingly closely even for moderate
values of n. Thus if n = 64, k = 32, and the standard-deviation
is 4. Deviations x have therefore to be considered up to ±12
or more, which is over 1/3 of k. As will be seen, however, from
the annexed table, the ordinates of the normal curve agree with
those of the binomial to the nearest unit (in 10,000 observations)
up to ±15. The closeness of approximation is partly due
to the fact that, in applying the logarithmic series to the
fraction on the right of equation (3), the terms of the second
order in expansions of corresponding brackets in numerator and
denominator cancel each other; these terms, therefore, do not
accumulate, but only the terms of the third order. There is
only one second-order term that has been neglected, viz. that due
to the last bracket in the denominator. Even for much lower
values of n than that chosen for the illustration, say 10 or 12
(cf. Qu. 4 at the end of this chapter), the normal curve still
gives a very fair approximation.


Table showing (1) Ordinates of the Binomial Series $10,000\left(\tfrac{1}{2}+\tfrac{1}{2}\right)^{64}$, and
(2) Corresponding Ordinates of the Normal Curve $y = \frac{10,000}{4\sqrt{2\pi}}\,e^{-x^{2}/32}$.

  Term         Binomial   Normal       Term         Binomial   Normal
               Series     Curve                     Series     Curve
  32             993       997         24 and 40      136       135
  31 and 33      963       967         23 and 41       80        79
  30 and 34      878       880         22 and 42       44        44
  29 and 35      753       753         21 and 43       23        23
  28 and 36      606       605         20 and 44       11        11
  27 and 37      459       457         19 and 45        5         5
  26 and 38      326       324         18 and 46        2         2
  25 and 39      217       216         17 and 47        1         1
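The two columns of the table can be recomputed directly from the
binomial coefficients and from equation (6) with N = 10,000 and σ = 4.
The following is a minimal sketch; the function names
binomial_ordinate and normal_ordinate are illustrative assumptions.

```python
from math import comb, exp, pi, sqrt

N, n = 10_000, 64
k = n // 2                 # 32, the half of the exponent
sigma = sqrt(n) / 2.0      # sqrt(n * 1/2 * 1/2) = 4

def binomial_ordinate(m):
    """Term of 10,000 (1/2 + 1/2)^64 for m successes."""
    return N * comb(n, m) / 2 ** n

def normal_ordinate(m):
    """Ordinate of the normal curve at the deviation x = m - 32."""
    x = m - k
    return N / (sigma * sqrt(2.0 * pi)) * exp(-x * x / (2.0 * sigma ** 2))

for m in range(k, 16, -1):
    print(m, round(binomial_ordinate(m)), round(normal_ordinate(m)))
```

Only the terms from 32 downwards are printed, the series being
symmetrical about its central term.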


13. But if the normal curve were limited in its application to
distributions which were certainly of binomial type, its use in
practice (apart from its theoretical applications to many cases of
the theory of sampling) would be very restricted. As suggested,
however, by the illustrations given in Chap. VI., a certain, though
not a large, number of distributions, more particularly among
those relating to measurements on man and other animals, are
approximately of normal form, even although such distributions
have not obviously originated in the same way as a binomial
distribution. Take, for example, the distribution of statures in
the United Kingdom (Chap. VI., Table VI.). The mean stature
is 67.46 inches, the standard-deviation 2.57 inches (the values are
worked out in the illustrations of Chaps. VII. and VIII.), and the
number of observations 8585. This gives $y_{0}$ = 1333, and all the
data necessary for plotting a normal curve of the same mean and
standard-deviation (the process of fitting is dealt with at greater
length in § 14 below). The two distributions are shown together
in fig. 49, the continuous curve being the normal curve, and the
small circles showing the observed frequencies. It is evident that
they agree very closely. Other body measurements, e.g. skull
measurements, etc., also follow the normal law; it also applies to
certain characters in plants (e.g. number of seeds per capsule in
Nelumbium, Pearl, American Naturalist, Nov. 1906). The question
arises, therefore, why, in such cases, the distribution should be
approximately normal, a form of distribution which we have only
shown to arise if the variable is the sum of a large number of
elements, each of which can take the values 0 and 1 (or other two
constant values), these values occurring independently, and with
equal frequency.

Fig. 49. The Distribution of Stature for Adult Males in the British Isles
(fig. 6, p. 89), fitted with a Normal Curve: to avoid confusing the
figure, the frequency-polygon has not been drawn in, the tops of the
ordinates being shown by small circles. (Horizontal scale: stature in inches.)

In the first place, it should be stated that the conditions of the
deduction given in § 9 were made a little unnecessarily restricted,
with a view to securing simplicity of algebra. The deduction
may be generalised, whilst retaining the same type of proof, by
assuming that p and q are unequal (provided p - q be small
compared with $\sqrt{npq}$, cf. § 3), that p and q are not quite the
same for all the events, that all the events are not quite independent,
or that n is not large, but that some sort of continuous
variation is possible in the values of the elementary variables,
these being no longer restricted to 0 and 1, or two other discrete
values. (Cf. the deduction given by Pearson in ref. 13.) Proceeding
further from this last idea, the deduction may be rendered
more general still, without introducing the conception of the
binomial at all, by founding the curve on more or less complex
cases of the theory of sampling for variables instead of for attributes.
If a variable is the sum (or, within limits, some slightly
more complicated function) of a large number of other variables,
then the distribution of the compound or resultant variable is
normal, provided that the elementary variables are independent,
or nearly so (cf. ref. 6). The forms of the frequency-distributions
of the elementary variables affect the final distribution less
and less as their number is increased: only if their number is
moderate, and the distributions all exhibit a comparatively high
degree of asymmetry of uniform sign, will the same sign of
asymmetry be sensibly evident in the distribution of the compound
variable. On this sort of hypothesis, the expectation of normality
in the case of stature may be based on the fact that it is a highly
compound character, depending on the sizes of the bones of the
head, the vertebral column, and the legs, the thickness of the
intervening cartilage, and the curvature of the spine, the elements
of which it is composed being at least to some extent independent,
i.e. by no means perfectly correlated with each other, and their
frequency-distributions exhibiting no very high degree of asymmetry
of one and the same sign. The comparative rarity of
normal distributions in economic statistics is probably due in part
to the fact that in most cases, while the entire causation is
certainly complex, relatively few causes have a largely predominant
influence (hence also the frequent occurrence of irregular
distributions in this field of work), and in part also to a high
degree of asymmetry in the distributions of the elements on which
the compound variable depends. Errors of observation may in
general be regarded as compounded of a number of elements, due
to various causes, and it was in this connection that the normal
curve was first deduced, and received its name of the curve of
errors, or law of error.
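The statement that the asymmetry of the elementary variables is less
and less evident in the compound variable as their number is increased
may be illustrated by compounding a markedly skew element with itself.
The sketch below is an illustration only: the element, taking the
values 0, 1, 2 with frequencies 0.7, 0.2, 0.1, is hypothetical, the
elements are assumed strictly independent, and the asymmetry is
measured by the third moment about the mean divided by the cube of the
standard-deviation, a measure chosen for the example.

```python
def convolve(a, b):
    """Distribution of the sum of two independent variables with distributions a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def skewness(dist):
    """Third moment about the mean divided by the cube of the standard-deviation."""
    mean = sum(v * f for v, f in enumerate(dist))
    m2 = sum((v - mean) ** 2 * f for v, f in enumerate(dist))
    m3 = sum((v - mean) ** 3 * f for v, f in enumerate(dist))
    return m3 / m2 ** 1.5

element = [0.7, 0.2, 0.1]   # hypothetical skew element taking the values 0, 1, 2

def sum_distribution(count):
    """Exact distribution of the sum of `count` independent elements."""
    dist = [1.0]
    for _ in range(count):
        dist = convolve(dist, element)
    return dist

for count in (1, 4, 16, 64):
    print(count, skewness(sum_distribution(count)))
```

On these assumptions the measure of asymmetry falls off inversely as
the square root of the number of elements compounded, so that a
sixteenfold increase in their number reduces it to one-quarter.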

14. If it be desired to compare some actual distribution
with the normal distribution, the two distributions should be
superposed on one diagram, as in fig. 49, though, of course, on
a much larger scale. When the mean and standard-deviation
of the actual distribution have been determined, $y_{0}$ is given by
equation (5); the fit will probably be slightly closer if the
standard-deviation is adjusted by Sheppard’s correction (Chap.
XI. § 4). The normal curve is then most readily drawn by plotting
a scale showing fifths of the standard-deviation along the
base line of the frequency diagram, taking the mean as origin,
and marking over these points the ordinates given by the figures
of the table of ordinates on p. 303, multiplied in each case by $y_{0}$.
The curve can be drawn freehand, or by aid of a curve ruler, through the
tops of the ordinates so determined. The logarithms of y in the
table on p. 303 are given to facilitate the multiplication. The only
point in which the student is likely to find any difficulty is
in the use of the scales: he must be careful to remember
that the standard-deviation must be expressed in terms of the
class-interval as a unit in order to obtain for $y_{0}$ a number of
observations per interval comparable with the frequencies of his
table.
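The arithmetic of the first method may be set out as follows for the
distribution of stature (8585 observations, mean 67.46 in.,
standard-deviation 2.57 in.). This is a sketch under assumptions: the
class-interval is taken as 1 inch, the ordinates are computed from the
exponential directly instead of being read from the table, and the
function name fitted_ordinates is hypothetical.

```python
from math import exp, pi, sqrt

def fitted_ordinates(n_obs, mean, sd, class_interval=1.0, fifths=25):
    """Ordinates of the fitted normal curve at fifths of the standard-deviation.

    The standard-deviation is expressed in class-intervals, so that y0 is a
    number of observations per interval.
    """
    sd_in_intervals = sd / class_interval
    y0 = n_obs / (sd_in_intervals * sqrt(2.0 * pi))
    points = []
    for i in range(-fifths, fifths + 1):
        t = i / 5.0                               # deviation in units of sigma
        points.append((mean + t * sd, y0 * exp(-t * t / 2.0)))
    return y0, points

y0, points = fitted_ordinates(8585, 67.46, 2.57)
print(round(y0))                                  # central ordinate
for stature, frequency in points[20:31]:
    print(round(stature, 2), round(frequency, 1))
```

Each pair gives a point on the base line, at so many fifths of the
standard-deviation from the mean, and the ordinate to be marked over it.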

The process may be varied by keeping the normal curve
drawn to one scale, and redrawing the actual distribution
so as to make the area, mean, and standard-deviation the
same. Thus suppose a diagram of a normal curve was printed
once for all to a scale, say, of $y_{0}$ = 5 inches, σ = 1 inch, and
it were required to fit the distribution of stature to it.
Since the standard-deviation is 2.57 inches of stature, the
scale of stature is 1 inch = 2.57 inches of stature, or 0.389 inches
= 1 inch of stature; this scale must be drawn on the base of the
normal-curve diagram, being so placed that the mean falls
at 67.46. As regards the scale of frequency-per-interval, this
is given by the fact that the whole area of the polygon showing
the actual distribution must be equal to the area of the
normal curve, that is $5\sqrt{2\pi}$ = 12.53 square inches. If, therefore,
the scale required is n observations per interval to the inch,
we have, the number of observations being 8585,

$$\frac{8585}{n \times 2.57} = 12.53,$$

which gives n = 266.6.

Though the second method saves curve drawing, the first,
on the whole, involves the least arithmetic and the simplest
plotting.

15. Any plotting of a diagram, or the equivalent arithmetical
comparison of actual frequencies with those given by the
fitted normal distribution, affords, of course, in itself, only a
rough test, of a practical kind, of the normality of the given
distribution. The question whether all the observed differences
between actual and calculated frequencies, taken together,
may have arisen merely as fluctuations of sampling, so that the
actual distribution may be regarded as strictly normal, neglecting
such errors, is a question of a kind that cannot be answered in
an elementary work (cf. ref. 22). At present the student is in
a position to compare the divergences of actual from calculated
frequencies with fluctuations of sampling in the case of single
class-intervals, or single groups of class-intervals only. If the
expected theoretical frequency in a certain interval is y, the
standard error of sampling is $\sqrt{y(N - y)/N}$; and if the divergence
of the observed from the theoretical frequency exceed some
three times this standard error, the divergence is unlikely to
have occurred as a mere fluctuation of sampling.

It should be noted, however, that the ordinate of the normal
curve at the middle of an interval does not give accurately the
area of that interval, or the number of observations within it: it
would only do so if the curve were sensibly straight. To deal
strictly with problems as to fluctuations of sampling in the
frequencies of single intervals or groups of intervals, we require,
accordingly, some convenient means of obtaining the number of
observations, in a given normal distribution, lying between any
two values of the variable.

16. If an ordinate be erected at a distance x/σ from the mean,
in a normal curve, it divides the whole area into two parts, the
ratio of which is evidently, from the mode of construction of the
curve, independent of the values of $y_{0}$ and of σ. The calculation
of these fractions of area for given values of x/σ, though a long
and tedious matter, can thus be done once for all, and a table
giving the results is useful for the purpose suggested in § 15 and
in many other ways. References to complete tables are cited at
the end of this work (list of tables, pp. 357-8), the short table below
being given only for illustrative purposes. The table shows the
greater fraction of the area lying on one side of any given ordinate;
e.g. 0.53983 of the whole area lies on one side of an ordinate at
0.1σ from the mean, and 0.46017 on the other side. It will be
seen that an ordinate drawn at a distance from the mean equal to
the standard-deviation cuts off some 16 per cent. of the whole
area on one side; some 68 per cent. of the area will therefore be
contained between ordinates at ±σ. An ordinate at twice the
standard-deviation cuts off only 2.3 per cent., and therefore some
95.4 per cent. of the whole area lies within a range of ±2σ. At
three times the standard-deviation the fraction of area cut off is
reduced to 135 parts in 100,000, leaving 99.7 per cent. within a
range of ±3σ. This is the basis of our rough rule that a range
of 6 times the standard-deviation will in general include the
great bulk of the observations: the rule is founded on, and is only
strictly true for, the normal distribution. For other forms of
distribution it need not hold good, though experience suggests
that it more often holds than not. The binomial distribution,
especially if p and q be unequal, only becomes approximately normal
when n is large, and this limitation must be remembered in applying
the table given, or similar more complete tables, to cases in which
the distribution is strictly binomial.
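The fractions of area quoted in this section may be recomputed without
a table by means of the error function, to which the area of the normal
curve is simply related. The sketch below assumes the erf function of
Python's standard library; the function name greater_fraction is
illustrative only.

```python
from math import erf, sqrt

def greater_fraction(t):
    """Greater fraction of the normal area cut off by an ordinate at t = x / sigma (t >= 0)."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

for t in (0.1, 1.0, 2.0, 3.0):
    greater = greater_fraction(t)
    within = 2.0 * greater - 1.0       # area between -t and +t
    print(t, round(greater, 5), round(1.0 - greater, 5), round(within, 4))
```

The last column gives the proportion of the whole area lying within a
range of ±x/σ, to be compared with the round figures of 68, 95.4, and
99.7 per cent.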



Table showing the Greater Fraction of the Area of a Normal Curve to One
Side of an Ordinate of Abscissa x/σ. (For references to more extended
tables, see list on pp. 357-8.)

  x/σ    Greater Fraction       x/σ    Greater Fraction
         of Area                       of Area
  0.0      .50000               2.1      .98214
  0.1      .53983               2.2      .98610
  0.2      .57926               2.3      .98928
  0.3      .61791               2.4      .99180
  0.4      .65542               2.5      .99379
  0.5      .69146               2.6      .99534
  0.6      .72575               2.7      .99653
  0.7      .75804               2.8      .99744
  0.8      .78814               2.9      .99813
  0.9      .81594               3.0      .99865
  1.0      .84134               3.1      .99903
  1.1      .86433               3.2      .99931
  1.2      .88493               3.3      .99952
  1.3      .90320               3.4      .99966
  1.4      .91924               3.5      .99977
  1.5      .93319               3.6      .99984
  1.6      .94520               3.7      .99989
  1.7      .95543               3.8      .99993
  1.8      .96407               3.9      .99995
  1.9      .97128               4.0      .99997
  2.0      .97725               4.1      .99998


17. If we try to determine the quartile deviation in terms of
the standard-deviation from the table, we see that it lies between
0.6 and 0.7σ. Interpolating, it is given approximately by

$$\left(0.6 + 0.1 \times \frac{0.75000 - 0.72575}{0.75804 - 0.72575}\right)\sigma = 0.675\,\sigma.$$

More exact interpolation gives the value 0.67448975σ. This result,
again, is the foundation of the rough rule that the semi-interquartile
range is usually some 2/3 of the standard-deviation: it is
strictly true for the normal curve only. It may be noted that
the constant 0.67448975 can be determined by processes of
interpolation only, and cannot be expressed exactly, like the
mean deviation, in terms of any other known constant, such
as π.
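The interpolation between the tabulated areas at 0.6σ and 0.7σ, and the
more exact value of the constant, may be compared as follows. This is a
minimal sketch which assumes the NormalDist class of Python's standard
statistics module for the exact inversion; the variable names are
illustrative.

```python
from statistics import NormalDist

# Linear interpolation between the tabulated areas at 0.6 and 0.7 sigma,
# seeking the deviation that leaves three-quarters of the area on one side.
lo_t, lo_area = 0.6, 0.72575
hi_t, hi_area = 0.7, 0.75804
interpolated = lo_t + (hi_t - lo_t) * (0.75 - lo_area) / (hi_area - lo_area)

# Direct numerical inversion of the area function of the normal curve.
exact = NormalDist().inv_cdf(0.75)

print(round(interpolated, 4), round(exact, 8))
```

The simple interpolation errs by less than one part in a thousand,
which is ample for the rough rule of two-thirds.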

It has become customary to use 0.674 ... times the standard
error rather than the standard error itself as a measure of the
unreliability of observed statistical results, and the term probable
error is given to this quantity. It should be noted that the word
“probable” is hardly used in its usual sense in this connection:
the probable error is merely a quantity such that we may expect
greater and less errors of simple sampling with about equal
frequency, provided always that the distribution of errors is
normal. On the whole, the use of the “probable error” has little
advantage compared with the standard error, and consequently little
stress is laid on it in the present work; but the term is in constant
use, and the student must be familiar with it.

It is true that the “probable error” has a simpler and more direct
significance than the standard error, but this advantage is lost as
soon as we come to deal with multiples of the probable error.
Further, the best modern tables of the ordinates and area of the
normal curve are given in terms of the standard-deviation or
standard error, not in terms of the probable error, and the
multiplication of the former by 0.6745, to obtain the probable error,
is not justified unless the distribution is normal. For very large
samples the distribution is approximately normal, even though p
and q are unequal; but this is not so for small samples, such as
often occur in practice. In the case of small samples the use of
the “probable error” is consequently of doubtful value, while the
standard error retains its significance as a measure of dispersion.
The “probable error,” it may be mentioned, is often stated after
an observed proportion with the ± sign before it; a percentage
given as 20.5 ± 2.3 signifying “20.5 per cent., with a probable
error of 2.3 per cent.”

If an error or deviation in, say, a certain proportion p only just
exceed the probable error, it is as likely as not to occur in simple
sampling: if it exceed twice the probable error (in either direction),
it is likely to occur as a deviation of simple sampling about 18
times in 100 trials, or the odds are about 4.6 to 1 against its
occurring at any one trial. For a range of three times the probable
error the odds are about 22 to 1, and for a range of four times the
probable error 142 to 1. Until a deviation exceeds, then, 4 times
the probable error, we cannot feel any great confidence that it is
likely to be “significant.” It is simpler to work with the standard
error and take ±3 times the standard error as the critical range:
for this range the odds are about 370 to 1 against such a deviation
occurring in simple sampling at any one trial.
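The odds quoted for multiples of the probable error follow from the
areas of the normal curve, and may be recomputed as below. The sketch
assumes normality of the distribution of errors throughout, and uses
the NormalDist class of Python's standard statistics module; the
function name odds_against is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
PROBABLE_ERROR = std_normal.inv_cdf(0.75)   # 0.6745 of the standard error

def odds_against(multiple_of_sigma):
    """Odds against a deviation, in either direction, exceeding the given multiple of sigma."""
    p_exceed = 2.0 * (1.0 - std_normal.cdf(multiple_of_sigma))
    return (1.0 - p_exceed) / p_exceed

for multiple in (1, 2, 3, 4):
    odds = odds_against(multiple * PROBABLE_ERROR)
    print(f"{multiple} x probable error: {odds:.1f} to 1")
print(f"3 x standard error: {odds_against(3.0):.0f} to 1")
```

The first line returns even odds, as the definition of the probable
error requires.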

18. The following are a few miscellaneous examples of the use
of the normal curve and the table of areas.

Example i. A hundred coins are thrown a number of times.
How often approximately in 10,000 throws may (1) exactly 65
heads, (2) 65 heads or more, be expected?

The standard-deviation is $\sqrt{0.5 \times 0.5 \times 100} = 5$. Taking the
distribution as normal, $y_{0}$ = 797.9.

The mean number of heads being 50, 65 - 50 = 3σ. The
frequency of a deviation of 3σ is given at once by the table (p. 303)
as 797.9 × .0111 ... = 8.86, or nearly 9 throws in 10,000. A
throw of 65 heads will therefore be expected about 9 times.

The frequency of throws of 65 heads or more is given by the
area table (p. 310), but a little caution must now be used, owing
to the discontinuity of the distribution. A throw of 65 heads is
equivalent to a range of 64.5-65.5 on the continuous scale of the
normal curve, the division between 64 and 65 coming at 64.5.
64.5 - 50 = +2.9σ, and a deviation of +2.9σ or more will only
occur, as given by the table, 187 times in 100,000 throws, or, say,
19 times in 10,000.
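The figures of Example i may be recomputed as follows. The sketch takes
the distribution as normal, as in the text, and assumes the NormalDist
class of Python's standard statistics module in place of the area
table; the variable names are illustrative.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

throws, n, p = 10_000, 100, 0.5
mean = n * p
sigma = sqrt(n * p * (1.0 - p))                  # 5 heads
y0 = throws / (sigma * sqrt(2.0 * pi))           # central ordinate

# (1) exactly 65 heads: ordinate at a deviation of 3 sigma
exactly_65 = y0 * exp(-((65 - mean) / sigma) ** 2 / 2.0)

# (2) 65 heads or more: area beyond 64.5 on the continuous scale
at_least_65 = throws * (1.0 - NormalDist(mean, sigma).cdf(64.5))

print(round(y0, 1), round(exactly_65, 2), round(at_least_65, 1))
```

The division is placed at 64.5, not at 65, for the reason given in the
text: a throw of 65 heads occupies the range 64.5 to 65.5 on the
continuous scale.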

Example ii. Taking the data of the stature-distribution of fig.
49 (mean 67.46, standard-deviation 2.57 in.), what proportion of
all the individuals will be within a range of ±1 inch of the
mean?

1 inch = 0.389σ. Simple interpolation in the table of p. 310
gives 0.65129 of the area below this deviation, or a more extended
table the more accurate value 0.65136. Within a range of
±0.389σ the fraction of the whole area is therefore 0.30272, or the
statures of about 303 per thousand of the given population will lie
within a range of ±1 inch from the mean.
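The interpolation of Example ii, and the more accurate value, may be
verified as below. The sketch assumes the NormalDist class of Python's
standard statistics module in place of the more extended table; the
variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 67.46, 2.57
t = round(1.0 / sd, 3)                     # 1 inch = 0.389 sigma

# Simple interpolation between the tabulated areas at 0.3 and 0.4 sigma.
interpolated = 0.61791 + (t - 0.3) / 0.1 * (0.65542 - 0.61791)

exact = NormalDist().cdf(t)                # area below +0.389 sigma
within = 2.0 * exact - 1.0                 # area between -0.389 and +0.389 sigma

print(round(interpolated, 5), round(exact, 5), round(within, 5))
```

The area within the range is obtained by doubling the excess of the
greater fraction over one-half.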

Example iii. In a case of crossing a Mendelian recessive by a
heterozygote the expectation of recessive offspring is 50 per cent.
(1) How often would 30 recessives or more be expected amongst 50
offspring owing simply to fluctuations of sampling? (2) How many
offspring would have to be obtained in order to reduce the probable
error to 1 per cent.?

The standard error of the percentage of recessives for 50
observations is $50\sqrt{1/50} = 7.07$. Thirty recessives in fifty is
a deviation of 5 offspring, or 10 per cent., from the mean, or, if we
take thirty as representing 29.5 or more, 4.5 offspring, or 9 per
cent., from the mean; that is, 9/7.07 = 1.273σ. A positive
deviation of this amount or more occurs about 101 times in 1000,
so that 30 recessives or more would be expected in about one-tenth
of the batches of 50 offspring. We have assumed
normality for rather a small value of n, but the result is sufficiently
accurate for practical purposes.

As regards the second part of the question we are to have

$$\frac{0.6745 \times 50}{\sqrt{n}} = 1,$$

n being the number of offspring. This gives n = 1137 to the
nearest unit.
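The two parts of Example iii may be worked as below, the deviation and
the standard error being both expressed in the same unit (per cent. of
offspring), and the normal value being set beside the sum of the exact
binomial terms. This is a sketch under assumptions: the NormalDist
class of Python's standard statistics module replaces the area table,
and the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.5
se_percent = 100.0 * sqrt(p * (1.0 - p) / n)     # 7.07 per cent.
deviation_percent = 100.0 * (29.5 / n - p)       # 9 per cent., i.e. 4.5 offspring
t = deviation_percent / se_percent               # deviation in units of sigma

# (1) chance of 30 recessives or more: normal area, and exact binomial sum
normal_tail = 1.0 - NormalDist().cdf(t)
exact_tail = sum(comb(n, m) for m in range(30, n + 1)) / 2 ** n

# (2) offspring needed for a probable error of 1 per cent.:
#     0.6745 * 50 / sqrt(n) = 1
n_required = (0.6745 * 50.0) ** 2

print(round(se_percent, 2), round(t, 3), round(normal_tail, 3), round(exact_tail, 3))
print(round(n_required))
```

The agreement of the normal area with the exact sum shows how little is
lost by assuming normality even for so small a number as fifty.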





Example iv. The diagram of fig. 49 shows that the number of
statures recorded in the group “62 in. and less than 63” is
markedly less than the theoretical value. Could such a difference
occur owing to fluctuations of simple sampling, and if so, how
often might it happen?

The actual frequency recorded is 169. To obtain the theoretical
frequency we may either take it as given roughly by the
ordinate in the centre of the interval, or, better, use the integral
table. Remembering that statures were only recorded to the
nearest 1/8 in., the true limits of the interval are 61 15/16 to 62 15/16, or
61.94-62.94, mid-value 62.44. This is a deviation from the
mean (67.46) of 5.02. Calculating the ordinate of the normal
curve directly we find the frequency 197.8. This is certainly, as
is evident from the form of the curve, a little too small. The
interval actually lies between deviations of 4.52 in. and 5.52
in., that is, 1.759σ and 2.148σ. The corresponding fractions of
area are 0.96071 and 0.98418, difference, or fraction of area
between the two ordinates, 0.02347. Multiplying this by the
whole number of observations (8585) we have the theoretical
frequency 201.5.

The difference of theoretical and observed frequencies is therefore
32.5. But the proportion of observations which should fall into
the given class is 0.023, the proportion falling into other classes
0.977, and the standard error of the class frequency is accordingly
$\sqrt{0.023 \times 0.977 \times 8585} = 14.0$. As the actual deviation is only
2.32 times this, it could certainly have occurred as a fluctuation of
sampling.

The question how often it might have occurred can only be
answered if we assume the distribution of fluctuations of sampling
to be approximately normal. It is true that p and q are very
unequal, but then n is very large (8585), so large that the
difference of the chances is fairly small compared with $\sqrt{npq}$
(about one-fifteenth). Hence we may take the distribution of
errors as roughly normal to a first approximation, though a
first approximation only. The tables give 0.990 of the area
below a deviation of 2.32σ, so we would expect an equal or
greater deficiency to occur about 10 times in 1000 trials, or once
in a hundred.
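The several steps of Example iv may be retraced as below. The sketch
assumes the NormalDist class of Python's standard statistics module in
place of the integral table, so that its figures may differ from those
of the text in the last place; the variable names are illustrative.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

N, mean, sd = 8585, 67.46, 2.57
lower, upper = 61.94, 62.94          # true limits of the class "62 in. and less than 63"
observed = 169

# Rough value: ordinate at the centre of the class (class-interval 1 inch).
mid = (lower + upper) / 2.0
ordinate = N / (sd * sqrt(2.0 * pi)) * exp(-((mid - mean) / sd) ** 2 / 2.0)

# Better value: area of the normal curve between the limits of the class.
curve = NormalDist(mean, sd)
fraction = curve.cdf(upper) - curve.cdf(lower)
expected = N * fraction

# Standard error of the class frequency, and the deficiency in terms of it.
std_error = sqrt(fraction * (1.0 - fraction) * N)
ratio = (expected - observed) / std_error
chance = 1.0 - NormalDist().cdf(ratio)   # chance of an equal or greater deficiency

print(round(ordinate, 1), round(expected, 1), round(std_error, 1))
print(round(ratio, 2), round(chance, 3))
```

The ordinate at the centre falls a little short of the frequency given
by the area, as is to be expected where the curve is concave upwards.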


REFERENCES 
The Binomial Machine. 

(1) Galton, Fkanois, Natural Inheritance ; Macmillan & Co. London, 1889, 
(Mechanical method of forming a bmomial or normal distribution, 
chap v., p. 63; for Pearson’s generalised machine, see below, 
ref 13.) 



314 


THEORY OF STATISTICS. 


Frequency Curves. 

For the early classical memoirs on the normal curve or lavp of error 
by Laplace, Gauss, and others, see Todhunter’s Eistoiy (Introduction 
ref 7) The literatuie of this subject is too extensive to enable us to do 
more than cite a few of the more recent memoiis, of which 6, 7, and 13 
are of fundamental importance The student will hnd other citations 
in 6, 8, and 14. 

(2) Charlier, C. V. L., "Researches into the Theory of Probability"
(Communications from the Astronomical Observatory, Lund); Lund,
1906.

(3) Cunningham, E., "The ω-Functions, a Class of Normal Functions
occurring in Statistics," Proc. Roy. Soc., Series A, vol. lxxxi., 1908,
p. 310.

(4) Edgeworth, F. Y., "On the Representation of Statistics by
Mathematical Formulae," Jour. Roy. Stat. Soc., vol. lxi., 1898; vol. lxii.,
1899; and vol. lxiii., 1900.

(5) Edgeworth, F. Y., Article on the "Law of Error" in the Encyclopaedia
Britannica, 10th edn., vol. xxviii., 1902, p. 280.

(6) Edgeworth, F. Y., "The Law of Error," Cambridge Phil. Trans., vol.
xx., 1904, pp. 36-65, 113-141 (and an appendix, pp. i-xiv, not
printed in the Cambridge Phil. Trans.).

(7) Edgeworth, F. Y., "The Generalised Law of Error, or Law of Great
Numbers," Jour. Roy. Stat. Soc., vol. lxix., 1906, p. 497.

(8) Edgeworth, F. Y., "On the Representation of Statistical Frequency by
a Curve," Jour. Roy. Stat. Soc., vol. lxx., 1907, p. 102.

(9) Fechner, G. T., Kollektivmasslehre (herausgegeben von G. F. Lipps);
Engelmann, Leipzig, 1897.

(10) Kapteyn, J. C., Skew Frequency Curves in Biology and Statistics;
Noordhoff, Groningen; Wm. Dawson & Sons, London, 1903.

(11) Macalister, Donald, "The Law of the Geometric Mean," Proc. Roy.
Soc., vol. xxix., 1879, p. 367.

(12) Nixon, J. W., "An Experimental Test of the Normal Law of Error,"
Jour. Roy. Stat. Soc., vol. lxxvi., 1913, pp. 702-706.

(13) Pearson, Karl, "Skew Variation in Homogeneous Material," Phil.
Trans. Roy. Soc., Series A, vol. clxxxvi., 1895, p. 343.
For the generalised binomial machine, see § 1. The memoir deals
with curves derived from the general binomial, and from a somewhat
analogous series derived from the case of sampling from limited
material. Supplement to the memoir, ibid., vol. cxcvii., 1901, p. 443.
For a derivation of the same curves from a modified standpoint,
ignoring the binomial and analogous distributions, cf. Chap. X.,
ref. 18.

(14) Pearson, Karl, "'Das Fehlergesetz und seine Verallgemeinerungen
durch Fechner und Pearson': A Rejoinder," Biometrika, vol. iv., 1905,
p. 169.

(15) Perozzo, Luigi, "Nuove Applicazioni del Calcolo delle Probabilità allo
Studio dei Fenomeni Statistici e Distribuzione dei Matrimoni secondo
l'Età degli Sposi," della Classe di Scienze morali, etc., Reale
Accad. dei Lincei, vol. x., Series 3, 1882.

(16) Sheppard, W. F., "On the Application of the Theory of Error to Cases
of Normal Distribution and Normal Correlation," Phil. Trans. Roy.
Soc., Series A, vol. cxcii., 1898, p. 101. (Includes a geometrical
treatment of the normal curve.)

(17) Yule, G. U., "On the Distribution of Deaths with Age when the Causes
of Death act cumulatively, and similar Frequency-distributions,"
Jour. Roy. Stat. Soc., vol. lxxiii., 1910, p. 26. (A binomial
distribution with negative index, and the related curve, i.e. a special
case of one of Pearson's curves, ref. 13.)

The Resolution of a Distribution compounded of two Normal
Curves into its Components.

(18) Pearson, Karl, "Contributions to the Mathematical Theory of
Evolution (on the Dissection of Asymmetrical Frequency Curves)," Phil.
Trans. Roy. Soc., Series A, vol. clxxxv., 1894, p. 71.

(19) Edgeworth, F. Y., "On the Representation of Statistics by
Mathematical Formulae," part ii., Jour. Roy. Stat. Soc., vol. lxii., 1899,
p. 125.

(20) Pearson, Karl, "On some Applications of the Theory of Chance to
Racial Differentiation," Phil. Mag., 6th Series, vol. i., 1901, p. 110.

(21) Helguero, Fernando de, "Per la risoluzione delle curve dimorfiche,"
Biometrika, vol. iv., 1905, p. 230. Also memoir under the same title
in the Transactions of the Reale Accademia dei Lincei, Rome, vol. vi.,
1906. (The first is a short note, the second the full memoir.)

See also the memoir by Charlier, cited in (2), section vi. of that
memoir dealing with the problem of dissection.

Testing the Fit of an Observed to a Theoretical or
another Observed Distribution.

(22) Pearson, Karl, "On the Criterion that a given System of Deviations
from the Probable, in the Case of a Correlated System of Variables, is
such that it can be reasonably supposed to have arisen from random
sampling," Phil. Mag., 5th Series, vol. l., 1900, p. 157.

(23) Pearson, Karl, "On the Probability that Two Independent
Distributions of Frequency are really Samples from the same Population,"
Biometrika, vol. viii., 1911, p. 250; also Biometrika, vol. x., 1914,
pp. 85-143.

EXERCISES.

1. Calculate the theoretical distributions for the three experimental cases
(1), (2), and (3) cited in § 7 of Chapter XIII.

2. Show that if np be a whole number, the mean of the binomial coincides
with the greatest term.

3. Show that if two symmetrical binomial distributions of degree n (and
of the same number of observations) are so superposed that the rth term of
the one coincides with the (r+1)th term of the other, the distribution
formed by adding superposed terms is a symmetrical binomial of degree n+1.

[Note: it follows that if two normal distributions of the same area and
standard-deviation are superposed so that the difference between the means is
small compared with the standard deviation, the compound curve is very
nearly normal.]

4. Calculate the ordinates of the binomial 1024 (0.5 + 0.5)^10, and compare
them with those of the normal curve.

5. Draw a diagram showing the distribution of statures of Cambridge
students (Chap. VI., Table VII.), and a normal curve of the same area,
mean, and standard-deviation superposed thereon.

6. Compare the values of the semi-interquartile range for the stature
distributions of male adults in the United Kingdom and Cambridge students,
(1) as found directly, (2) as calculated from the standard-deviation, on the
assumption that the distribution is normal.





7. Taking the mean stature for the British Isles as 67.46 in. (the
distribution of fig. 49), the mean for Cambridge students as 68.85 in., and the
common standard-deviation as 2.56 in., what percentage of Cambridge students
exceed the British mean in stature, assuming the distribution normal?

8. As stated in Chap. XIII., Example ii., certain crosses of Pisum sativum
based on 7125 seeds gave 25.32 per cent. of green seeds instead of the theoretical
proportion 25 per cent., the standard error being 0.51 per cent. In what
percentage of experiments based on the same number of seeds might an equal or
greater percentage be expected to occur owing to fluctuations of sampling
alone?

9. In what proportion of similar experiments based on (1) 100 seeds, (2)
1000 seeds, might (a) 30 per cent. or more, (b) 35 per cent. or more, of green
seeds, be expected to occur, if ever?

10. In similar experiments, what number of seeds must be obtained to
make the "probable error" of the proportion 1 per cent.?

11. If skulls are classified as dolichocephalic when the length-breadth
index is under 75, mesocephalic when the same index lies between 75 and 80,
and brachycephalic when the index is over 80, find approximately (assuming
that the distribution is normal) the mean and standard-deviation of a series
in which 58 per cent. are stated to be dolichocephalic, 38 per cent.
mesocephalic, and 4 per cent. brachycephalic.



CHAPTER XVI. 


NORMAL CORRELATION. 

1-3. Deduction of the general expression for the normal correlation surface
from the case of independence. 4. Constancy of the standard-deviations
of parallel arrays and linearity of the regression. 5. The contour
lines a series of concentric and similar ellipses. 6. The normal
surface for two correlated variables regarded as a normal surface for
uncorrelated variables rotated with respect to the axes of measurement:
arrays taken at any angle across the surface are normal distributions
with constant standard-deviation: distribution of and correlation
between linear functions of two normally correlated variables are
normal: principal axes. 7. Standard-deviations round the principal
axes. 8-11. Investigation of Table III., Chap. IX., to test normality:
linearity of regression, constancy of standard-deviation of arrays,
normality of distribution obtained by diagonal addition, contour
lines. 12-13. Isotropy of the normal distribution for two variables.
14. Outline of the principal properties of the normal distribution for
n variables.

1. The expression that we have obtained for the "normal"
distribution of a single variable may readily be made to yield a
corresponding expression for the distribution of frequency of pairs
of values of two variables. This normal distribution for two
variables, or "normal correlation surface," is of great historical
importance, as the earlier work on correlation is, almost without
exception, based on the assumption of such a distribution;
though when it was recognised that the properties of the
correlation-coefficient could be deduced, as in Chap. IX., without reference
to the form of the distribution of frequency, a knowledge of
this special type of frequency-surface ceased to be so essential.
But the generalised normal law is of importance in the theory of
sampling: it serves to describe very approximately certain actual
distributions (e.g. of measurements on man); and if it can be
assumed to hold good, some of the expressions in the theory of
correlation, notably the standard-deviations of arrays (and, if
more than two variables are involved, the partial
correlation-coefficients), can be assigned more simple and definite meanings
than in the general case. The student should, therefore, be
familiar with the more fundamental properties of the distribution.



2. Consider first the case in which the two variables are
completely independent. Let the distributions of frequency for the
two variables $x_1$ and $x_2$, singly, be

$$ y_1 = y'_1\, e^{-x_1^2/2\sigma_1^2}, \qquad y_2 = y'_2\, e^{-x_2^2/2\sigma_2^2} \tag{1} $$

Then, assuming independence, the frequency-distribution of pairs
of values must, by the rule of independence, be given by

$$ y_{12} = y'_{12}\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2}\right)} \tag{2} $$

where

$$ y'_{12} = \frac{N}{2\pi\,\sigma_1\sigma_2} \tag{3} $$

Equation (2) gives a normal correlation surface for one special
case, the correlation-coefficient being zero. If we put $x_2$ = a
constant, we see that every section of the surface by a vertical plane
parallel to the $x_1$ axis, i.e. the distribution of any array of $x_1$'s, is
a normal distribution, with the same mean and standard-deviation
as the total distribution of $x_1$'s, and a similar statement holds for
the array of $x_2$'s; these properties must hold good, of course, as
the two variables are assumed independent (cf. Chap. V. § 13).
The contour lines of the surface, that is to say, lines drawn on
the surface at a constant height, are a series of similar ellipses
with major and minor axes parallel to the axes of $x_1$ and $x_2$ and
proportional to $\sigma_1$ and $\sigma_2$, the equations to the contour lines being
of the general form

$$ \frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2} = c^2 \tag{4} $$

Pairs of values of $x_1$ and $x_2$ related by an equation of this form
are, therefore, equally frequent.

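3. To pass from this special case of independence to the general
case of two correlated variables, remember (Chap. XII. § 8)
that if

$$ x_{1.2} = x_1 - b_{12}\,x_2, \qquad x_{2.1} = x_2 - b_{21}\,x_1, $$

$x_1$ and $x_{2.1}$, as also $x_2$ and $x_{1.2}$, are uncorrelated. If they are not
merely uncorrelated but completely independent, and if the
distribution of each of the deviations singly be normal, we must have
for the frequency-distribution of pairs of deviations of $x_1$ and $x_{2.1}$

$$ y_{12} = y'_{12}\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_1^2} + \frac{x_{2.1}^2}{\sigma_{2.1}^2}\right)} \tag{5} $$

But

$$ \frac{x_1^2}{\sigma_1^2} + \frac{x_{2.1}^2}{\sigma_{2.1}^2}
 = \frac{x_1^2}{\sigma_1^2(1-r_{12}^2)} + \frac{x_2^2}{\sigma_2^2(1-r_{12}^2)} - \frac{2\,r_{12}\,x_1x_2}{\sigma_1\sigma_2(1-r_{12}^2)}
 = \frac{x_1^2}{\sigma_{1.2}^2} + \frac{x_2^2}{\sigma_{2.1}^2} - 2\,r_{12}\,\frac{x_1x_2}{\sigma_{1.2}\,\sigma_{2.1}} . $$

Evidently we would also have arrived at precisely the same
expression if we had taken the distribution of frequency for
$x_2$ and $x_{1.2}$, and reduced the exponent

$$ \frac{x_2^2}{\sigma_2^2} + \frac{x_{1.2}^2}{\sigma_{1.2}^2} . $$

We have, therefore, the general expression for the normal
correlation surface for two variables

$$ y_{12} = y'_{12}\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_{1.2}^2} + \frac{x_2^2}{\sigma_{2.1}^2} - 2\,r_{12}\frac{x_1x_2}{\sigma_{1.2}\,\sigma_{2.1}}\right)} \tag{6} $$

Further, since $x_1$ and $x_{2.1}$, $x_2$ and $x_{1.2}$, are independent, we must
have

$$ y'_{12} = \frac{N}{2\pi\,\sigma_1\sigma_{2.1}} = \frac{N}{2\pi\,\sigma_2\sigma_{1.2}} = \frac{N}{2\pi\,\sigma_1\sigma_2\,(1-r_{12}^2)^{1/2}} \tag{7} $$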
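
The reduction of the exponent of (5) to that of (6) may be confirmed arithmetically. The sketch below is illustrative only: the function names and the trial constants are assumptions chosen for the check, and the relations $b_{21} = r_{12}\sigma_2/\sigma_1$, $\sigma_{1.2} = \sigma_1(1-r_{12}^2)^{1/2}$, $\sigma_{2.1} = \sigma_2(1-r_{12}^2)^{1/2}$ are taken from Chap. XII.

```python
import math

def exponent_from_residual(x1, x2, s1, s2, r):
    """Exponent of (5): x1^2/s1^2 + x_{2.1}^2/s_{2.1}^2, with x_{2.1} = x2 - b21*x1."""
    b21 = r * s2 / s1
    s2_1 = s2 * math.sqrt(1.0 - r * r)
    x2_1 = x2 - b21 * x1
    return x1 ** 2 / s1 ** 2 + x2_1 ** 2 / s2_1 ** 2

def exponent_general(x1, x2, s1, s2, r):
    """Exponent of (6), written with sigma_{1.2} and sigma_{2.1}."""
    root = math.sqrt(1.0 - r * r)
    s1_2, s2_1 = s1 * root, s2 * root
    return x1 ** 2 / s1_2 ** 2 + x2 ** 2 / s2_1 ** 2 - 2.0 * r * x1 * x2 / (s1_2 * s2_1)

s1, s2, r = 2.0, 3.0, 0.6          # arbitrary trial constants (not from the text)
a = exponent_from_residual(1.5, -0.7, s1, s2, r)
b = exponent_general(1.5, -0.7, s1, s2, r)
print(a, b)                        # the two forms should agree
```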


4. If we assign to $x_2$ some fixed value, say $h_2$, we have the
distribution of the array of $x_1$'s of type $h_2$,

$$ y_{12} = y'_{12}\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_{1.2}^2} + \frac{h_2^2}{\sigma_{2.1}^2} - 2\,r_{12}\frac{x_1h_2}{\sigma_{1.2}\,\sigma_{2.1}}\right)}
 = y'_{12}\, e^{-h_2^2/2\sigma_2^2}\; e^{-\left(x_1 - r_{12}\frac{\sigma_1}{\sigma_2}h_2\right)^2 / 2\sigma_{1.2}^2} . $$

This is a normal distribution of standard-deviation $\sigma_{1.2}$, with a
mean deviating by $r_{12}\frac{\sigma_1}{\sigma_2}h_2$ from the mean of the whole
distribution of $x_1$'s. As $h_2$ represents any value whatever of $x_2$, we see
(1) that the standard-deviations of all arrays of $x_1$ are the same,
and equal to $\sigma_{1.2}$; (2) that the regression of $x_1$ on $x_2$ is strictly
linear. Similarly, of course, if we assign to $x_1$ any value $h_1$, we
will find (1) that the standard-deviations of all arrays of $x_2$ are
the same: (2) that the regression of $x_2$ on $x_1$ is strictly linear.
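
The two properties may be illustrated arithmetically. In the sketch below the helper name `array_of_x1` is an assumption made for illustration, and the constants are those quoted in this chapter for the stature table of Chap. IX.; the formulae used are those just stated for the mean and standard-deviation of an array.

```python
import math

def array_of_x1(h2, s1, s2, r):
    """Mean and standard deviation of the x1-array for which x2 = h2
    (all values being deviations from the means)."""
    mean = r * (s1 / s2) * h2            # linear in h2: the regression of x1 on x2
    sd = s1 * math.sqrt(1.0 - r * r)     # sigma_{1.2}, the same for every array
    return mean, sd

s1, s2, r = 2.72, 2.75, 0.51
arrays = {h2: array_of_x1(h2, s1, s2, r) for h2 in (-4.0, 0.0, 4.0)}
for h2, (mean, sd) in arrays.items():
    print(h2, round(mean, 3), round(sd, 3))
```

The means move in proportion to $h_2$ while the standard-deviation does not change from one array to another.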



Fig. 50. Principal Axes and Contour Lines of the Normal
Correlation Surface.

5. The contour lines are, as in the case of independence, a
series of concentric and similar ellipses; the major and minor
axes are, however, no longer parallel to the axes of $x_1$ and $x_2$, but
make a certain angle with them. Fig. 50 illustrates the
calculated form of the contour lines for one case, RR and CC being
the lines of regression. As each line of regression cuts every
array of $x_1$ or of $x_2$ in its mean, and as the distribution of every
array is symmetrical about its mean, RR must bisect every
horizontal chord and CC every vertical chord, as illustrated
by the two chords shown by dotted lines: it also follows that
RR cuts all the ellipses in the points of contact of the horizontal
tangents to the ellipses, and CC in the points of contact of
the vertical tangents. The surface or solid itself, somewhat
truncated, is shown in fig. 29, p. 166.

6. Since, as we see from fig. 50, a normal surface for two
correlated variables may be regarded merely as a certain surface
for which r is zero turned round through some angle, and since
for every angle through which it is turned the distributions of all
$x_1$ arrays and $x_2$ arrays are normal, it follows that every section
of a normal surface by a vertical plane is a normal curve, i.e. the
distributions of arrays taken at any angle across the surface are
normal. It also follows that, since the total distributions of $x_1$
and $x_2$ must be normal for every angle through which the surface
is turned, the distributions of totals given by slices or arrays
taken at any angle across a normal surface must be normal
distributions. But these would give the distributions of functions
like $a.x_1 \pm b.x_2$, and consequently (1) the distribution of any
linear function of two normally distributed variables $x_1$ and $x_2$
must also be normal; (2) the correlation between any two linear
functions of two normally distributed variables must be normal
correlation.

To find the angle θ through which the surface has been turned,
from the position for which the correlation is zero to the position
for which the coefficient has some assigned value r, we must use
a little trigonometry. The major and minor axes of the ellipses
are sometimes termed the principal axes. If $\xi_1$, $\xi_2$ be the
co-ordinates referred to the principal axes (the $\xi_1$-axis being the
$x_1$ axis in its new position) we have for the relation between
$\xi_1$, $\xi_2$, $x_1$, $x_2$, the angle θ being taken as positive for a rotation of
the $x_1$-axis which will make it, if continued through 90°, coincide
in direction and sense with the $x_2$-axis,

$$ \xi_1 = x_1\cos\theta + x_2\sin\theta, \qquad \xi_2 = x_2\cos\theta - x_1\sin\theta \tag{8} $$

But, since $\xi_1$, $\xi_2$ are uncorrelated, $S(\xi_1\xi_2) = 0$. Hence, multiplying
together equations (8) and summing,

$$ 0 = (\sigma_2^2 - \sigma_1^2)\sin 2\theta + 2\,r_{12}\,\sigma_1\sigma_2\cos 2\theta, \qquad\text{or}\qquad \tan 2\theta = \frac{2\,r_{12}\,\sigma_1\sigma_2}{\sigma_1^2 - \sigma_2^2} \tag{9} $$

It should be noticed that if we define the principal axes of any
distribution for two variables as being a pair of axes at right
angles for which the variables $\xi_1$, $\xi_2$ are uncorrelated, equation
(9) gives the angle that they make with the axes of measurement
whether the distribution be normal or no.
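
Equation (9) lends itself to direct computation. The following sketch is an illustration under stated assumptions: the function names are hypothetical, `math.atan2` is used to place 2θ in the correct quadrant, and the constants are those quoted in this chapter for the stature table of Chap. IX.

```python
import math

def principal_axis_angle(s1, s2, r):
    """Angle theta in degrees satisfying (9): tan 2*theta = 2 r s1 s2 / (s1^2 - s2^2)."""
    two_theta = math.atan2(2.0 * r * s1 * s2, s1 ** 2 - s2 ** 2)
    return math.degrees(two_theta) / 2.0

def mean_product_after_rotation(s1, s2, r, theta_deg):
    """Mean product of xi_1 and xi_2 for axes turned through theta; zero on the principal axes."""
    t = math.radians(theta_deg)
    return 0.5 * (s2 ** 2 - s1 ** 2) * math.sin(2 * t) + r * s1 * s2 * math.cos(2 * t)

theta = principal_axis_angle(2.72, 2.75, 0.51)
residual = mean_product_after_rotation(2.72, 2.75, 0.51, theta)
print(round(theta, 2), residual)
```

With nearly equal standard-deviations the angle comes out close to 45°, as the formula suggests it should.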

7. The two standard-deviations, say $\Sigma_1$ and $\Sigma_2$, about the
principal axes are of some interest, for evidently from § 2 the
major and minor axes of the contour-ellipses are proportional
to these two standard-deviations. They may be most readily
determined as follows. Squaring the two transformation equations
(8), summing and adding, we have

$$ \Sigma_1^2 + \Sigma_2^2 = \sigma_1^2 + \sigma_2^2 \tag{10} $$

Referring the surface to the axes of measurement, we have for
the central ordinate by equation (7)

$$ y'_{12} = \frac{N}{2\pi\,\sigma_1\sigma_2\,(1-r_{12}^2)^{1/2}} . $$

Referring it to the principal axes, by equation (3)

$$ y'_{12} = \frac{N}{2\pi\,\Sigma_1\Sigma_2} . $$

But these two values of the central ordinate must be equal,
therefore

$$ \Sigma_1\Sigma_2 = \sigma_1\sigma_2\,(1-r_{12}^2)^{1/2} \tag{11} $$

(10) and (11) are a pair of simultaneous equations from which
$\Sigma_1$ and $\Sigma_2$ may be very simply obtained in any arithmetical case.
Care must, however, be taken to give the correct signs to the
square root in solving: $\Sigma_1 + \Sigma_2$ is necessarily positive, and
$\Sigma_1 - \Sigma_2$ also if r is positive, the major axes of the ellipses lying
along $\xi_1$: but if r be negative, $\Sigma_1 - \Sigma_2$ is also negative. It should
be noted that, while we have deduced (11) from a simple consideration
depending on the normality of the distribution, it is really of
general application (like equation 10), and may be obtained at
somewhat greater length from the equations for transforming
co-ordinates.
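
The solution of the pair (10) and (11) may be set out as a short routine. This is a sketch only: the function name `principal_sds` is an assumption, the sign of the difference is made to follow r as the text directs, and the constants are those quoted in this chapter for the stature table.

```python
import math

def principal_sds(s1, s2, r):
    """Solve (10) and (11) for the standard-deviations about the principal axes."""
    total = s1 ** 2 + s2 ** 2                                   # Sigma_1^2 + Sigma_2^2
    twice_product = 2.0 * s1 * s2 * math.sqrt(1.0 - r * r)      # 2 Sigma_1 Sigma_2
    plus = math.sqrt(total + twice_product)                     # Sigma_1 + Sigma_2, positive
    minus = math.copysign(math.sqrt(total - twice_product), r)  # sign follows r
    return (plus + minus) / 2.0, (plus - minus) / 2.0

big_sigma_1, big_sigma_2 = principal_sds(2.72, 2.75, 0.51)
print(round(big_sigma_1, 2), round(big_sigma_2, 2))
```

Adding and subtracting the two equations gives the squares of the sum and difference, from which the two values follow by halving.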

8. As stated in Chap. XV. § 13, the frequency-distribution
for any variable may be expected to be approximately normal
if that variable may be regarded as the sum (or, within limits,
some slightly more complex function) of a large number of other
variables, provided that these elementary component variables
are independent, or nearly so. Similarly, the correlation between
two variables may be expected to be approximately normal if
each of the two variables may be regarded as the sum, or some
slightly more complex function, of a large number of elementary
component variables, the intensity of correlation depending on
the proportion of the components common to the two variables.

Stature is a highly compound character of this kind, and we
have seen that, in one instance at least, the distribution of stature
for a number of adults is given approximately by the normal
curve. We can now utilise Table III., Chap. IX., p. 160, showing
the correlation between stature of father and son, to test, as far
as we can by elementary methods, whether the normal surface
will fit the distribution of the same character in pairs of
individuals: we leave it to the student to test, as far as he can do so
by simple graphical methods, the approximate normality of the
total distributions for this table. The first important property
of the normal distribution is the linearity of the regression.
This was well illustrated in fig. 37, p. 174, and the closeness of
the regression to linearity was confirmed by the values of
the correlation-ratios (p. 206), viz., 0.52 in each case as
compared with a correlation of 0.51. Subject to some investigation
as to the possibility of the deviations that do occur
arising as fluctuations of simple sampling, when drawing
samples from a record for which the regression is strictly
linear, we may conclude that the regression is appreciably
linear.

9. The second important property of the normal distribution
for two variables is the constancy of the standard-deviation for
all parallel arrays. We gave in Chap. X. p. 204 the
standard-deviations of ten of the columns of the present table, from the
column headed 62.5-63.5 onwards; these were:

    2.56    2.11    2.55    2.24    2.23
    2.60    2.26    2.26    2.45    2.33

the mean being 2.36. The standard-deviations again only fluctuate
irregularly round their mean value. The mean of the first five
is 2.34, of the second five 2.38, a difference of only 0.04: of the
first group, two are greater and three are less than the mean,
and the same is true of the second group. There does not seem
to be any indication of a general tendency for the
standard-deviation to increase or decrease as we pass from one end of the
table to the other. We are not yet in a position to test how
far the differences from the average standard deviation might
arise in sampling from a record in which the distribution was
strictly normal, but, as a fact, a rough test suggests that they
might have done so.
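
The averages quoted may be verified in a few lines. The sketch below is illustrative; it assumes that the ten standard-deviations are taken in the order of the columns of the table (first five, then second five), an order inferred from the group means stated in the text.

```python
# Standard-deviations of ten columns of Table III., Chap. IX., as quoted above.
sds = [2.56, 2.11, 2.55, 2.24, 2.23, 2.60, 2.26, 2.26, 2.45, 2.33]

mean_all = sum(sds) / len(sds)
mean_first = sum(sds[:5]) / 5
mean_second = sum(sds[5:]) / 5

# How many of each group of five exceed the general mean.
above_first = sum(1 for s in sds[:5] if s > mean_all)
above_second = sum(1 for s in sds[5:] if s > mean_all)

print(round(mean_all, 2), round(mean_first, 2), round(mean_second, 2))
print(above_first, above_second)
```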

10. Next we note that the distributions of all arrays of a
normal surface should themselves be normal. Owing, however,
to the small numbers of observations in any array, the distributions
of arrays are very irregular, and their normality cannot be tested
in any very satisfactory way: we can only say that they do not
exhibit any marked or regular asymmetry. But we can test the
allied property of a normal correlation-table, viz. that the totals
of arrays must give a normal distribution even if the arrays be
taken diagonally across the surface, and not parallel to either
axis of measurement (cf. § 6). From an ordinary correlation-table
we cannot find the totals of such diagonal arrays exactly,
but the totals of arrays at an angle of 45° will be given with
sufficient accuracy for our present purpose by the totals of lines
of diagonally adjacent compartments. Referring again to Table
III., Chap. IX., and forming the totals of such diagonals (running
up from left to right), we find, starting at the top left-hand
corner of the table, the following distribution (read along the rows):

     0.25    2       3.25    6.25    8       9.75   17      34.5
    42      46.25   60.5    67.5    85.75   87.25   78      94.25
    78.75   81.25   66.5    59.25   42.25   30.75   29.25   19
    10.75    7       4.25    3.5     1.75    1       0.25

    Total 1078

The mean of this distribution is at 0.359 of an interval above the
centre of the interval with frequency 78: its standard-deviation
is 4.757 intervals, or, remembering that the interval is $1/\sqrt{2}$ of
an inch, 3.364 inches. (This value may be checked directly from
the constants for the table given in Chap. IX., Question 3, p. 189,
for we have from the first of the transformation equations (8),

$$ \sigma_\xi^2 = \sigma_1^2\cos^2\theta + \sigma_2^2\sin^2\theta + 2\,r_{12}\,\sigma_1\sigma_2\sin\theta\cos\theta, $$

and inserting $\sigma_1 = 2.72$, $\sigma_2 = 2.75$, $r_{12} = 0.51$, $\sin\theta = \cos\theta = 1/\sqrt{2}$,
find $\sigma_\xi = 3.361$.) Drawing a diagram and fitting a normal
curve we have fig. 51: the distribution is rather irregular but the
fit is fair; certainly there is no marked asymmetry, and, so far as
the graphical test goes, the distribution may be regarded as
appreciably normal. One of the greatest divergences of the
actual distribution from the normal curve occurs in the almost
central interval with frequency 78: the difference between the
observed and calculated frequencies is here 12 units, but the
standard error is 9.1, so that it may well have occurred as a
fluctuation of simple sampling.
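
The constants of the diagonal distribution may be recomputed from the totals. The following sketch is an illustration only; it assumes the totals are taken in the order in which they are read from the top left-hand corner of the table, with the interval of frequency 78 as working origin, and the variable names are ours.

```python
import math

# Diagonal totals of Table III., Chap. IX., in order from the top left-hand corner.
totals = [0.25, 2, 3.25, 6.25, 8, 9.75, 17, 34.5, 42, 46.25, 60.5, 67.5,
          85.75, 87.25, 78, 94.25, 78.75, 81.25, 66.5, 59.25, 42.25, 30.75,
          29.25, 19, 10.75, 7, 4.25, 3.5, 1.75, 1, 0.25]

origin = totals.index(78)                 # interval with frequency 78 as origin
n = sum(totals)
mean = sum((i - origin) * f for i, f in enumerate(totals)) / n
second_moment = sum((i - origin) ** 2 * f for i, f in enumerate(totals)) / n
sd_intervals = math.sqrt(second_moment - mean ** 2)
sd_inches = sd_intervals / math.sqrt(2.0)  # one interval is 1/sqrt(2) of an inch

# Check from the first of equations (8), with sin(theta) = cos(theta) = 1/sqrt(2).
s1, s2, r = 2.72, 2.75, 0.51
sd_check = math.sqrt(0.5 * s1 ** 2 + 0.5 * s2 ** 2 + r * s1 * s2)

print(n, round(mean, 3), round(sd_intervals, 3), round(sd_inches, 3), round(sd_check, 3))
```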



Fig. 51. Distribution of Frequency obtained by addition of Table III.,
Chap. IX., along Diagonals running up from left to right, fitted with a
Normal Curve.


11. So far, we have seen (1) that the regression is
approximately linear; (2) that, in the arrays which we have tested, the
standard-deviations are approximately constant, or at least that
their differences are only small, irregular and fluctuating; (3) that
the distribution of totals for one set of diagonal arrays is
approximately normal. These results suggest, though they cannot
completely prove, that the whole distribution of frequency may
be regarded as approximately normal, within the limits of
fluctuations of sampling. We may therefore apply a more searching
test, viz. the form of the contour lines and the closeness of their
fit to the contour-ellipses of the normal surface. We can see at
once, however, that no very close fit can be expected. Since the
frequencies in the compartments of the table are small, the
standard error of any frequency is given approximately by its
square root (Chap. XIII. § 12), and this implies a standard error
of about 5 units at the centre of the table, 3 units for a frequency
of 9, or 2 units for a frequency of 4: such fluctuations might
cause wide divergences in the corresponding contour lines.

Using the suffix 1 to denote the constants relating to the
distribution of stature for fathers, and 2 the same constants for
the sons,

    N = 1078        M_1 = 67.70     M_2 = 68.66
    σ_1 = 2.72      σ_2 = 2.75      r_12 = 0.51

Hence we have from equation (7)

$$ y'_{12} = \frac{N}{2\pi\,\sigma_1\sigma_2\,(1-r_{12}^2)^{1/2}} = 26.67 $$

and the complete expression for the fitted normal surface is

$$ y_{12} = 26.67\; e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_{1.2}^2} + \frac{x_2^2}{\sigma_{2.1}^2} - 2\,r_{12}\frac{x_1x_2}{\sigma_{1.2}\,\sigma_{2.1}}\right)} . $$

The equation to any contour ellipse will be given by equating
the index of e to a constant, but it is very much easier to draw
the ellipses if we refer them to their principal axes. To do this
we must first determine $\Sigma_1$ and $\Sigma_2$. From (9),

$$ \tan 2\theta = -46.49, $$

whence 2θ = 91° 14', θ = 45° 37', the principal axes standing very
nearly at an angle of 45° with the axes of measurement,
owing to the two standard-deviations being very nearly equal.
They should be set off on the diagram, not with a protractor, but
by taking tan θ from the tables (1.022) and calculating points on
each axis on either side of the mean.

To obtain $\Sigma_1$ and $\Sigma_2$ we have from (10) and (11)

$$ \Sigma_1^2 + \Sigma_2^2 = 14.961, \qquad 2\,\Sigma_1\Sigma_2 = 12.868 . $$

Adding and subtracting these equations from each other and
taking the square root,

$$ \Sigma_1 + \Sigma_2 = 5.276, \qquad \Sigma_1 - \Sigma_2 = 1.447, $$

whence $\Sigma_1 = 3.36$, $\Sigma_2 = 1.91$: owing to the principal axes
standing nearly at 45° the first value is sensibly the same as that found
for $\sigma_\xi$ in § 10. The equations to the contour ellipses, referred to
the principal axes, may therefore be written in the form

$$ \frac{\xi_1^2}{(3.36)^2} + \frac{\xi_2^2}{(1.91)^2} = c^2, $$

the major and minor axes being 3.36 × c and 1.91 × c respectively.
To find c for any assigned value of the frequency y we have

$$ y = y'_{12}\, e^{-\frac{1}{2}c^2}, \qquad c^2 = \frac{2\,(\log y'_{12} - \log y)}{\log e} . $$


Supposing that we desire to draw the three contour-ellipses for
y = 5, 10 and 20, we find c = 1.83, 1.40 and 0.76, or the following
values for the major and minor axes of the ellipses: semi-major
axes, 6.15, 4.70, 2.55: semi-minor axes, 3.50, 2.67, 1.45. The
ellipses drawn with these axes are shown in fig. 52, very much
reduced, of course, from the original drawing, one of the squares
shown representing a square inch on the original. The actual
contour lines for the same frequencies are shown by the irregular
polygons superposed on the ellipses, the points on these polygons
having been obtained by simple graphical interpolation between
the frequencies in each row and each column, diagonal
interpolation between the frequencies in a row and the frequencies in a
column not being used. It will be seen that the fit of the two
lower contours is, on the whole, fair, especially considering the
high standard errors. In the case of the central contour, y = 20,
the fit looks very poor to the eye, but if the ellipse be compared
carefully with the table, the figures suggest that here again we
have only to deal with the effects of fluctuations of sampling.
For father's stature = 66 in., son's stature = 70 in., there is
a frequency of 18.75, and an increase in this much less than the
standard error would bring the actual contour outside the ellipse.
Again, for father's stature = 68 in., son's stature = 71 in., there
is a frequency of 19, and an increase of a single unit would give
a point on the actual contour below the ellipse. Taking the
results as a whole, the fit must be regarded as quite as good as
we could expect with such small frequencies. It is perhaps of
historical interest to note that Sir Francis Galton, working
without a knowledge of the theory of normal correlation, suggested
that the contour lines of a similar table for the inheritance of
stature seemed to be closely represented by a series of concentric
and similar ellipses (ref. 2): the suggestion was confirmed when
he handed the problem, in abstract terms, to a mathematician,
Mr J. D. Hamilton Dickson (ref. 4), asking him to investigate
"the Surface of Frequency of Error that would result from
these data, and the various shapes and other particulars of its
sections that were made by horizontal planes" (ref. 3, p. 102).

Fig. 52. Contour Lines for the Frequencies 5, 10 and 20 of the distribution
of Table III., Chap. IX., and corresponding Contour Ellipses of the fitted
Normal Surface: principal axes; M, mean.
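
The arithmetic of the contour-ellipses may be retraced as follows. This is a sketch under stated assumptions: the central ordinate is computed from equation (7) with the constants quoted in § 11, the standard-deviations about the principal axes are taken as the rounded values 3.36 and 1.91 found in the text, and the function name `contour` is introduced for illustration.

```python
import math

N, s1, s2, r = 1078, 2.72, 2.75, 0.51
central = N / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - r * r))   # equation (7)
big_sigma_1, big_sigma_2 = 3.36, 1.91                              # values found in the text

def contour(y):
    """Return c and the two semi-axes of the contour-ellipse of frequency y."""
    c = math.sqrt(2.0 * math.log(central / y))   # from y = y' exp(-c^2 / 2)
    return c, big_sigma_1 * c, big_sigma_2 * c

ellipses = {y: contour(y) for y in (5, 10, 20)}
for y, (c, major, minor) in ellipses.items():
    print(y, round(c, 2), round(major, 2), round(minor, 2))
```

Natural logarithms are used here in place of the common logarithms of the text, so that the division by log e is not needed.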

12. The normal distribution of frequency for two variables is
an isotropic distribution, to which all the theorems of Chap. V.
§§ 11-12 apply. For if we isolate the four compartments of the
correlation-table common to the rows and columns centring
round values of the variables $x_1$, $x'_1$, $x_2$, $x'_2$, we have for the ratio
of the cross-products (frequency of $x_1x_2$ multiplied by frequency
of $x'_1x'_2$, divided by frequency of $x_1x'_2$ multiplied by frequency of
$x'_1x_2$),

$$ e^{\frac{r_{12}}{\sigma_{1.2}\,\sigma_{2.1}}\,(x'_1 - x_1)(x'_2 - x_2)} . $$

Assuming that $x'_1 - x_1$ has been taken of the same sign as $x'_2 - x_2$,
the exponent is of the same sign as $r_{12}$. Hence the association for
this group of four frequencies is also of the same sign as $r_{12}$, the
ratio of the cross-products being unity, or the association zero,
if $r_{12}$ is zero. In a normal distribution, the association is therefore
of the same sign (the sign of $r_{12}$) for every tetrad of frequencies
in the compartments common to two rows and two columns: that
is to say, the distribution is isotropic. It follows that every
grouping of a normal distribution is isotropic whether the
class-intervals are equal or unequal, large or small, and the sign of the
association for a normal distribution grouped down to 2- × 2-fold
form must always be the same whatever the axes of division
chosen.
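
The expression for the ratio of the cross-products may be checked by direct evaluation of the surface (6). The sketch below is illustrative: the function names and the trial deviations are assumptions, and the constants are those quoted in this chapter for the stature table.

```python
import math

def frequency(x1, x2, s1, s2, r):
    """Normal correlation surface (6), omitting the constant factor y'_12."""
    root = math.sqrt(1.0 - r * r)
    a, b = s1 * root, s2 * root            # sigma_{1.2} and sigma_{2.1}
    return math.exp(-0.5 * (x1 ** 2 / a ** 2 + x2 ** 2 / b ** 2
                            - 2.0 * r * x1 * x2 / (a * b)))

def cross_product_ratio(x1, x1p, x2, x2p, s1, s2, r):
    """(x1, x2)(x1', x2') divided by (x1, x2')(x1', x2)."""
    num = frequency(x1, x2, s1, s2, r) * frequency(x1p, x2p, s1, s2, r)
    den = frequency(x1, x2p, s1, s2, r) * frequency(x1p, x2, s1, s2, r)
    return num / den

s1, s2, r = 2.72, 2.75, 0.51
ratio = cross_product_ratio(-1.0, 2.0, 0.5, 3.0, s1, s2, r)
# Closed form: exp{ r (x1' - x1)(x2' - x2) / (sigma_{1.2} sigma_{2.1}) }.
predicted = math.exp(r * (2.0 - (-1.0)) * (3.0 - 0.5) / (s1 * s2 * (1.0 - r * r)))
print(ratio, predicted)
```

A ratio greater than unity corresponds to positive association in the tetrad, a ratio less than unity to negative association.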

These theorems are of importance in the applications of the
theory of normal correlation to the treatment of qualitative
characters which are subjected to a manifold classification. The
contingency tables for such characters are sometimes regarded as
groupings of a normal distribution of frequency, and the coefficient
of correlation is determined on this hypothesis by a rather lengthy
procedure (ref. 14). Before applying this procedure it is well,
therefore, to see whether the distribution of frequency may be
regarded as approximately isotropic, or reducible to isotropic form
by some alteration in the order of rows and columns (Chap. V.
§§ 9-10). If only reducible to isotropic form by some
rearrangement, this rearrangement should be effected before grouping the
table to 2- × 2-fold form for the calculation of the correlation
coefficient by the process referred to. If the table is not reducible
to isotropic form by any rearrangement, the process of calculating
the coefficient of correlation on the assumption of normality is to
be avoided. Clearly, even if the table be isotropic it need not be
normal, but at least the test for isotropy affords a rapid and
simple means for excluding certain distributions which are not
even remotely normal. Table II. of Chap. V. might possibly be
regarded as a grouping of normally distributed frequency if
rearranged as suggested in § 10 of the same chapter (it would be
worth the investigator's while to proceed further and compare
the actual distribution with a fitted normal distribution), but
Table IV. could not be regarded as normal, and could not be
rearranged so as to give a grouping of normally distributed
frequency.

13 If the frequencies in a contingency-table be not large, and 
also if the contingency or correlation be small, the influence 
of casual irregularities due to fluctuations of sampling may 
render it difficult to say whether the distribution may be regarded 
as essentially isotropic or no In such cases some fuither con- 
densation of the table by grouping together adjacent row^s and 
columns, or some process of “smoothing” by averaging the 



330 


THEORY OF STATISTICS. 


frequencies in adjacent compartments, may be of service The 
correlation4able for stature in tather and son (Table III , Chap 
IX ), for instance, is obviously not strictly isotropic as it stands 
we have seen, however, that it appears to be normal, within the 
limits of fluctuations of sampling, and it should consequently be 
isotropic within such limits We can apply a rough test by 
regrouping the table in a much coarser form, say with four rows 
and four columns the table below exhibits such a grouping, the 
limits of rows and of columns having been so fixed as to include 
not less than 200 observations in each array. 


Table I. (Condensed from Table III. of Chapter IX.)

                           Father’s Stature (inches).
Son’s Stature      Under                                69.5
(inches).          65.5     65.5-67.5    67.5-69.5    and over    Total

Under 66.5          97.5      74.25        34.75        10.5       217
66.5-68.5           76.5     108           85           52         321.5
68.5-70.5           33.25     64.75        95           84.5       277.5
70.5 and over       14.75     32.5         80.75       134         262

Total              222       279.5        295.5        281        1078


Taking the ratio of the frequency in col. 1 to the sum of the
frequencies in cols. 1 and 2 for each successive row, and so on for
the other pairs of columns, we find the following series of ratios:


Table II. Ratio of Frequency in Column m to Frequency in Column m
+ Frequency in Column m + 1 of Table I.

                       Columns
Row.      1 and 2      2 and 3      3 and 4

 1         0.568        0.681        0.768
 2         0.415        0.560        0.620
 3         0.339        0.405        0.529
 4         0.312        0.287        0.376


These ratios decrease continuously as we pass from the top to the
bottom of the table, and the distribution, as condensed, is therefore
isotropic. The student should form one or two other condensations
of the original table to 3 × 3- or 4 × 4-fold form: he will probably
find them either isotropic, or diverging so slightly from isotropy
that an alteration of the frequencies, well within the margin of
possible fluctuations of sampling, will render the distribution
isotropic.
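The rough test just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind Table II: the function names are invented for the purpose, and the criterion coded (every column of ratios falling from the top row to the bottom) is simply the one applied in the text to the condensed table.

```python
# Condensed father/son stature table (Table I): rows are the son's
# stature classes, columns the father's; marginal totals are omitted.
TABLE_I = [
    [97.5, 74.25, 34.75, 10.5],
    [76.5, 108.0, 85.0, 52.0],
    [33.25, 64.75, 95.0, 84.5],
    [14.75, 32.5, 80.75, 134.0],
]


def adjacent_column_ratios(table):
    """For each row, return f[m] / (f[m] + f[m+1]) for adjacent columns."""
    return [
        [row[m] / (row[m] + row[m + 1]) for m in range(len(row) - 1)]
        for row in table
    ]


def ratios_decrease_down_rows(ratios):
    """True if every column of ratios falls strictly from top to bottom."""
    return all(
        upper[m] > lower[m]
        for upper, lower in zip(ratios, ratios[1:])
        for m in range(len(upper))
    )


ratios = adjacent_column_ratios(TABLE_I)
for number, row in enumerate(ratios, start=1):
    print(number, [round(value, 3) for value in row])
print("isotropic as condensed:", ratios_decrease_down_rows(ratios))
```

The same two functions may be applied to any other condensation of the original table that the student cares to form.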

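14. Before concluding this chapter we may note briefly some
of the principal properties of the normal distribution of frequency
for any number of variables, referring the student for proofs to
the original memoirs. Denoting the frequency of the combination
of deviations $x_1, x_2, \ldots, x_n$ by
$y_{12 \ldots n}\,dx_1\,dx_2 \ldots dx_n$, we must have,
in the notation of Chapter XII., if the uncorrelated deviations $x_1$,
$x_{2.1}$, $x_{3.12}$, etc., are completely independent (cf. § 3 of the
present chapter),

\[ y_{12 \ldots n} = y'_{12 \ldots n}\, e^{-\frac{1}{2}\phi} \tag{12} \]

where

\[ \phi = \frac{x_1^2}{\sigma_1^2} + \frac{x_{2.1}^2}{\sigma_{2.1}^2}
        + \frac{x_{3.12}^2}{\sigma_{3.12}^2} + \ldots
        + \frac{x_{n.12 \ldots (n-1)}^2}{\sigma_{n.12 \ldots (n-1)}^2} \tag{13} \]

and

\[ y'_{12 \ldots n} = \frac{N}{(2\pi)^{n/2}\,\sigma_1\,\sigma_{2.1}\,\sigma_{3.12}
        \ldots \sigma_{n.12 \ldots (n-1)}} \tag{14} \]

The expression (13) for the exponent $\phi$ may be reduced to a
general form corresponding to that given for two variables, viz.

\[ \phi = \frac{x_1^2}{\sigma_{1.23 \ldots n}^2} + \frac{x_2^2}{\sigma_{2.13 \ldots n}^2}
        + \ldots + \frac{x_n^2}{\sigma_{n.12 \ldots (n-1)}^2}
        - 2\,r_{12.3 \ldots n}\,\frac{x_1 x_2}{\sigma_{1.23 \ldots n}\,\sigma_{2.13 \ldots n}}
        - \ldots
        - 2\,r_{(n-1)n.12 \ldots (n-2)}\,\frac{x_{n-1} x_n}{\sigma_{(n-1).12 \ldots (n-2)n}\,\sigma_{n.12 \ldots (n-1)}} \tag{15} \]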

Several important results may be deduced directly from the form
(13) for the exponent. Clearly this might have been written in
a great variety of ways, commencing with any deviation of the
first order, allotting any primary subscript to the second deviation
(except the subscript of the first), and so on, just as in § 3 we
arrived at precisely the same final form for the exponent whether
we started with the two deviations $x_1$ and $x_{2.1}$ or with $x_2$ and
$x_{1.2}$. Our assumption, then, that the deviations $x_1$, $x_{2.1}$,
$x_{3.12}$, etc., are normally distributed amounts to the assumption
that all deviations of any order and with any suffixes are normally
distributed, i.e. in the general normal distribution for n variables
every array of every order is a normal distribution. It will also
follow, generalising the deduction of § 6, that any linear function
of $x_1, x_2, \ldots, x_n$ is normally distributed. Further, if in (13) any
fixed values be assigned to $x_3$ and all the following deviations, the
correlation between $x_1$ and $x_2$, on expanding $x_{2.1}$, is, as we have
seen, normal correlation. Similarly, if any fixed values be
assigned to $x_1$, to $x_4$ and all the following deviations, on
reducing $x_{3.12}$ to the second order we shall find that the
correlation between $x_{2.1}$ and $x_{3.1}$ is normal correlation, the
correlation coefficient being $r_{23.1}$, and so on. That is to say,
using k to denote any group of secondary suffixes, (1) the correlation
between any two deviations $x_{m.k}$ and $x_{n.k}$ is normal correlation;
(2) the correlation between the said deviations is $r_{mn.k}$ whatever
the particular fixed values assigned to the remaining deviations.
The latter conclusion, it will be seen, renders the meaning of partial
correlation coefficients much more definite in the case of normal
correlation than in the general case. In the general case $r_{mn.k}$
represents merely the average correlation, so to speak, between
$x_{m.k}$ and $x_{n.k}$: in the normal case $r_{mn.k}$ is constant for all
the sub-groups corresponding to particular assigned values of the other
variables. Thus in the case of three variables which are normally
correlated, if we assign any given value to $x_3$, the correlation
between the associated values of $x_1$ and $x_2$ is $r_{12.3}$: in the
general case $r_{12.3}$, if actually worked out for the various
sub-groups corresponding, say, to increasing values of $x_3$, would
probably exhibit some continuous change, increasing or decreasing as
the case might be. Finally, we have to note that if, in the expression
(15) for $\phi$, we assign fixed values, say $h_2$, $h_3$, etc., to all the
deviations except $x_1$, and then throw $\phi$ into the form of a perfect
square (as in § 4 for the case of two variables), we obtain a normal
distribution for $x_1$ in which the mean is displaced by

\[ r_{12.34 \ldots n}\,\frac{\sigma_{1.23 \ldots n}}{\sigma_{2.13 \ldots n}}\,h_2
 + r_{13.24 \ldots n}\,\frac{\sigma_{1.23 \ldots n}}{\sigma_{3.12 \ldots n}}\,h_3
 + \ldots
 + r_{1n.23 \ldots (n-1)}\,\frac{\sigma_{1.23 \ldots n}}{\sigma_{n.12 \ldots (n-1)}}\,h_n . \]

But this is a linear function of $h_2$, $h_3$, etc.; therefore in the case
of normal correlation the regression of any one variable on any or all
of the others is strictly linear. The expressions
$r_{12.34 \ldots n}\,\sigma_{1.23 \ldots n}/\sigma_{2.13 \ldots n}$, etc., are of
course the partial regressions $b_{12.34 \ldots n}$, etc.
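For three variables the displacement of the mean can be worked numerically. The sketch below is an illustration only: the standard-deviations and correlations used are hypothetical figures chosen for the purpose, the function names are invented, and the formulae for $r_{12.3}$ and $\sigma_{1.23}$ are the ordinary three-variable expressions of partial correlation theory.

```python
import math


def partial_correlation(r12, r13, r23):
    """r_{12.3}: correlation of variables 1 and 2 with variable 3 fixed."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


def residual_sd(sd1, r12, r13, r23):
    """sigma_{1.23}: s.d. of variable 1 about its regression on 2 and 3."""
    r12_3 = partial_correlation(r12, r13, r23)
    return sd1 * math.sqrt((1 - r13 ** 2) * (1 - r12_3 ** 2))


def partial_regression(sd1, sd2, r12, r13, r23):
    """b_{12.3} written as r_{12.3} * sigma_{1.23} / sigma_{2.13}."""
    r12_3 = partial_correlation(r12, r13, r23)
    sigma_1_23 = residual_sd(sd1, r12, r13, r23)
    sigma_2_13 = residual_sd(sd2, r12, r23, r13)
    return r12_3 * sigma_1_23 / sigma_2_13


def displaced_mean(h2, h3, sd, r12, r13, r23):
    """Mean of x1 when x2 = h2 and x3 = h3: a linear function of h2, h3."""
    sd1, sd2, sd3 = sd
    b12_3 = partial_regression(sd1, sd2, r12, r13, r23)
    b13_2 = partial_regression(sd1, sd3, r13, r12, r23)
    return b12_3 * h2 + b13_2 * h3


# Hypothetical values, chosen only to illustrate the arithmetic.
sd = (2.0, 3.0, 1.5)
r12, r13, r23 = 0.5, 0.4, 0.3
print("r_12.3 =", round(partial_correlation(r12, r13, r23), 4))
print("b_12.3 =", round(partial_regression(sd[0], sd[1], r12, r13, r23), 4))
print("mean of x1 for h2 = 1, h3 = 1:",
      round(displaced_mean(1.0, 1.0, sd, r12, r13, r23), 4))
```

Doubling both assigned values doubles the displacement, which is the linearity of regression stated above.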


REFERENCES.

General.

(1) Bravais, A., “Analyse mathématique sur les probabilités des erreurs de
situation d’un point,” Acad. des Sciences: Mémoires présentés par divers
savants, II^e série, ix., 1846, p. 255.

(2) Galton, Francis, “Family Likeness in Stature,” Proc. Roy. Soc., vol. xl.,
1886, p. 42.

(3) Galton, Francis, Natural Inheritance; Macmillan & Co., 1889.

(4) Dickson, J. D. Hamilton, Appendix to (2), Proc. Roy. Soc., vol. xl.,
1886, p. 63.

(5) Edgeworth, F. Y., “On Correlated Averages,” Phil. Mag., 5th Series,
vol. xxxiv., 1892, p. 190.

(6) Pearson, Karl, “Regression, Heredity, and Panmixia,” Phil. Trans.
Roy. Soc., Series A, vol. clxxxvii., 1896, p. 253.

(7) Pearson, Karl, “On Lines and Planes of Closest Fit to Systems of Points
in Space,” Phil. Mag., 6th Series, vol. ii., 1901, p. 559. (On the fitting
of “principal axes” and the corresponding planes in the case of more
than two variables.)

(8) Pearson, Karl, “On the Influence of Natural Selection on the Variability
and Correlation of Organs,” Phil. Trans. Roy. Soc., Series A, vol. cc.,
1902, p. 1. (Based on the assumption of normal correlation.)

(9) Pearson, Karl, and Alice Lee, “On the Generalised Probable Error in
Multiple Normal Correlation,” Biometrika, vol. vi., 1908, p. 59.

(10) Yule, G. U., “On the Theory of Correlation,” Jour. Roy. Stat. Soc.,
vol. lx., 1897, p. 812.

(11) Yule, G. U., “On the Theory of Correlation for any number of Variables
treated by a New System of Notation,” Proc. Roy. Soc., Series A, vol.
lxxix., 1907, p. 182.

(12) Sheppard, W. F., “On the Application of the Theory of Error to Cases
of Normal Distribution and Normal Correlation,” Phil. Trans. Roy.
Soc., Series A, vol. cxcii., 1898, p. 101.

(13) Sheppard, W. F., “On the Calculation of the Double-integral expressing
Normal Correlation,” Cambridge Phil. Trans., vol. xix., 1900, p. 23.

Applications to the Theory of Attributes, etc.

(14) Pearson, Karl, “On the Correlation of Characters not Quantitatively
Measurable,” Phil. Trans. Roy. Soc., Series A, vol. cxcv., 1900, p. 1.
(Cf. criticism in ref. 3 of Chap. III.)

(15) Pearson, Karl, “On a New Method of Determining Correlation between
a Measured Character A and a Character B, of which only the Percentage
of Cases wherein B exceeds (or falls short of) a given Intensity is
recorded for each grade of A,” Biometrika, vol. vii., 1909, p. 96.

(16) Pearson, Karl, “On a New Method of Determining Correlation, when
one Variable is given by Alternative and the other by Multiple
Categories,” Biometrika, vol. vii., 1910, p. 248.

See also the memoir (12) by Sheppard.

Various Methods and their Relation to Normal Correlation.

(17) Pearson, Karl, “On the Theory of Contingency and its Relation to
Association and Normal Correlation,” Drapers’ Company Research
Memoirs, Biometric Series I.; Dulau & Co., London, 1904.

(18) Pearson, Karl, “On Further Methods of Determining Correlation,”
Drapers’ Company Research Memoirs, Biometric Series IV. (Methods
based on correlation of ranks: difference methods.) Dulau & Co.,
London, 1907.

(19) Spearman, C., “A Footrule for Measuring Correlation,” Brit. Jour. of
Psychology, vol. ii., 1906, p. 89. (The suggestion of a “rank” method:
see Pearson’s criticism and improved formula in (18) and Spearman’s
reply on some points in (20).)

(20) Spearman, C., “Correlation calculated from Faulty Data,” Brit. Jour.
of Psychology, vol. iii., 1910, p. 271.

(21) Thorndike, E. L., “Empirical Studies in the Theory of Measurement,”
Archives of Psychology (New York), 1907.





EXERCISES. 

1. Deduce equation (11) from the equations for transformation of co-ordinates
without assuming the normal distribution. (A proof will be found in ref. 10.)

2. Hence show that if the pairs of observed values of $x_1$ and $x_2$ are
represented by points on a plane, and a straight line drawn through the mean, the
sum of the squares of the distances of the points from this line is a minimum
if the line is the major principal axis.

3. The coefficient of correlation with reference to the principal axes being
zero, and with reference to other axes something, there must be some pair of
axes at right angles for which the correlation is a maximum, i.e. is numerically
greatest without regard to sign. Show that these axes make an angle of 45°
with the principal axes, and that the maximum value of the correlation is

\[ \pm\frac{\Sigma_1^2 - \Sigma_2^2}{\Sigma_1^2 + \Sigma_2^2} . \]

4. (Sheppard, ref. 12.) A fourfold table is formed from a normal correlation
table, taking the points of division between A and α, B and β, at the
medians, so that (A) = (α) = (B) = (β) = N/2. Show that



CHAPTER XVII.

THE SIMPLER CASES OF SAMPLING FOR VARIABLES:
PERCENTILES AND MEAN.

1-2. The problem of sampling for variables: the conditions assumed.
3. Standard error of a percentile. 4. Special values for the percentiles
of a normal distribution. 5. Effect of the form of the distribution
generally. 6. Simplified formula for the case of a grouped
frequency-distribution. 7. Correlation between errors in two percentiles
of the same distribution. 8. Standard error of the interquartile range
for the normal curve. 9. Effect of removing the restrictions of simple
sampling, and limitations of interpretation. 10. Standard error of the
arithmetic mean. 11. Relative stability of mean and median in sampling.
12. Standard error of the difference between two means. 13. The tendency
to normality of a distribution of means. 14. Effect of removing the
restrictions of simple sampling. 15. Statement of the standard errors of
standard deviation, coefficient of variation, correlation coefficient and
regression, correlation-ratio and criterion for linearity of regression.
16. Restatement of the limitations of interpretation if the sample be small.

1. In Chapters XIII.-XVI. we have been concerned solely with
the theory of sampling for the case of attributes and the
frequency-distributions appropriate to that case. We now proceed to
consider some of the simpler theorems for the case of variables
(cf. Chap. XIII. § 2). Suppose that we have a bag containing a
practically infinite number of tickets or cards bearing the recorded
values of some variable X, and that we draw a ticket from this
bag, note the value that it bears, draw another, and so on until
we have drawn n cards (a number small compared with the whole
number in the bag). Let us continue this process until we have
N such samples of n cards each, and then work out the mean,
standard-deviation, median, etc., for each of the samples. No one
of these measures will prove to be absolutely the same for every
sample, and our problem is to determine the standard-deviation
that each such measure will exhibit.
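The experiment just described is easily imitated on a computer. The following is a minimal sketch under stated assumptions: the record of cards numbered 1 to 100 is hypothetical, the function name is invented, and the practically infinite bag is imitated by drawing with replacement from the finite record.

```python
import random
import statistics


def sampling_experiment(record, n, num_samples, seed=0):
    """Draw num_samples samples of n cards each, with replacement, and
    return the mean and the median of every sample."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(num_samples):
        sample = [rng.choice(record) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return means, medians


# Hypothetical record: cards bearing the values 1 to 100.
record = list(range(1, 101))
means, medians = sampling_experiment(record, n=25, num_samples=2000)
print("s.d. of X in the record :", round(statistics.pstdev(record), 2))
print("s.d. of sample means    :", round(statistics.pstdev(means), 2))
print("s.d. of sample medians  :", round(statistics.pstdev(medians), 2))
```

The two standard-deviations printed last are the quantities whose theoretical values are sought in the sections that follow.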

2. In solving this problem, we must be careful to define
precisely the conditions which are assumed to subsist, so as to
realise the limitations of any solution obtained. These conditions
were discussed very fully for the case of attributes (Chap. XIII.
§ 8), and we would refer the student to the discussion then given.
Here it is sufficient to state the assumptions briefly, using the
letters (a), (b) and (c) to denote the corresponding assumptions
indicated by the same letters in the section cited.

(a) We assume that we are drawing from precisely the same
record throughout the experiment, so that the chance of drawing
a card with any given value of X, or a value within any assigned
limits, is the same at each sampling.

(b) We assume not only that we are drawing from the same
record throughout, but that each of our cards at each drawing
may be regarded quite strictly as drawn from the same record (or
from identically similar records): e.g. if our card-record is
contained in a series of bundles, we must not make it a practice to
take the first card from bundle number 1, the second card from
bundle number 2, and so on, or else the chance of drawing a
card with a given value of X, or a value within assigned limits,
may not be the same for each individual card at each drawing.

(c) We assume that the drawing of each card is entirely
independent of that of every other, so that the value of X recorded
on card 1, at each drawing, is uncorrelated with the value of X
recorded on card 2, 3, 4, and so on. It is for this reason that we
spoke of the record, in § 1, as containing a practically infinite
number of cards, for otherwise the successive drawings at each
sampling would not be independent: if the bag contain ten
tickets only, bearing the numbers 1 to 10, and we draw the card
bearing 1, the average of the following cards drawn will be higher
than the mean of all cards drawn; if, on the other hand, we draw
the 10, the average of the following cards will be lower than the mean
of all cards, i.e. there will be a negative correlation between the
number on the card taken at any one drawing and the card taken
at any other drawing. Without making the number of cards in
the bag indefinitely large, we can, as already pointed out for the
case of attributes (Chap. XIII. § 3), eliminate this correlation by
replacing each card before drawing the next.
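The negative correlation in the bag of ten tickets can be exhibited exactly by enumerating every equally likely pair of first and second drawings. The sketch below is an illustration only; the helper `correlation` is written out for the purpose and is not taken from any library.

```python
import math
from itertools import permutations, product


def correlation(pairs):
    """Product-moment correlation of a list of equally likely (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    return cov / math.sqrt(var_x * var_y)


tickets = range(1, 11)
# First and second drawings when the first ticket is NOT replaced.
without_replacement = list(permutations(tickets, 2))
# First and second drawings when the first ticket IS replaced.
with_replacement = list(product(tickets, repeat=2))

print("without replacement:", round(correlation(without_replacement), 4))
print("with replacement   :", round(correlation(with_replacement), 4))
```

Without replacement the correlation is -1/9 for a bag of ten tickets; with replacement it vanishes, as condition (c) requires.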

Sampling conducted under these conditions we shall, as before,
speak of as simple sampling. We do not, it should be noticed,
make the further assumption that the sample is unbiassed, i.e.
that the chance of inclusion in the sample is independent of the
value of X recorded on the card (cf. the last paragraph in § 8,
Chap. XIII., and the discussion in §§ 4-8, Chap. XIV.). This
assumption is unnecessary. If it be true, the interpretation of
our results becomes simpler and more straightforward, for we
can substitute for such phrases as “the standard-deviation of X
in a very large sample,” “the form of the frequency-distribution
in a very large sample,” the phrases “the standard-deviation of
X in the original record,” “the form of the frequency-distribution
in the original record”; but in very many, perhaps the majority
of, practical cases the very question at issue is the nature of the
relation between the distribution of the sample and the
distribution of the record from which it is drawn. As has already been
emphasised in the passages to which reference is made above, no
examination of samples drawn under the same conditions can
give any evidence on this head.

3. Standard Error of a Percentile. Let us consider first the
fluctuations of sampling for a given percentile, as the problem is
intimately related to that of Chaps. XIII.-XIV.

Let $X_p$ be a value of X such that a proportion p of the values
of X in an indefinitely large sample drawn under the same
conditions lie above it and a proportion q below it.

If we note the proportions of observations above $X_p$ in samples
of n drawn from the record, we know that these observed values
will tend to centre round p as mean, with a standard-deviation
$\sqrt{pq/n}$. If now at each drawing, as well as observing the
proportion of X’s above $X_p$, say $p + \delta$, for the sample, we also
proceed to note the adjustment $\epsilon$ required in $X_p$ to make the
proportion of observations above $X_p + \epsilon$ in the sample p, the
standard-deviation of $\epsilon$ will bear to the standard-deviation of
$\delta$ the same ratio that $\epsilon$ on an average bears to $\delta$. But this
ratio is quite simply determinable if the number of observations in
the sample is sufficiently large to justify us in assuming that
$\delta$ is small: so small that we may regard the element of the
frequency curve (for a very large sample) over which $X_p + \epsilon$
ranges as approximately a rectangle. If this assumption be made, and
we denote the standard-deviation of X in a very large sample by
$\sigma$, and the ordinate of the frequency curve at $X_p$ when drawn
with unit area and unit standard-deviation by $y_p$,

\[ \delta = \frac{y_p}{\sigma}\,\epsilon . \]

Therefore for the standard-deviation of $\epsilon$, or of the percentile
corresponding to a proportion p, we have

\[ \sigma_{\epsilon} = \frac{\sigma}{y_p}\sqrt{\frac{pq}{n}} \tag{1} \]

4. If the frequency-distribution for the very large sample be a
normal curve, the values of $y_p$ for the principal percentiles may be
taken from the published tables. A table calculated by Mr
Sheppard (Table III., p. 9, in Tables for Statisticians and
Biometricians, or Table IV., ref. 16, in Appendix I.) gives the values
directly, and these have been utilised for the following: the
student can estimate the values roughly by a combined use of the
area and ordinate tables for the normal curve given in Chapter
XV., remembering to divide the ordinates given in that table by
$\sqrt{2\pi}$ so as to make the area unity.

                          Value of $y_p$.

Median . . . . . . .      0.3989423
Deciles 4 and 6  . .      0.3863425
   „    3 and 7  . .      0.3476926
   „    2 and 8  . .      0.2799619
   „    1 and 9  . .      0.1754983
Quartiles  . . . . .      0.3177766

Inserting these values of $y_p$ in equation (1), we have the
following values for the standard errors of the median, deciles,
etc., and the values given in the second column for their probable
errors (Chap. XV. § 17), which the student may sometimes find
useful:

                      Standard error is        Probable error is
                      σ/√n multiplied by       σ/√n multiplied by

Median . . . . . .        1.25331                  0.84535
Deciles 4 and 6  .        1.26804                  0.85528
   „    3 and 7  .        1.31800                  0.88897
   „    2 and 8  .        1.42877                  0.96369
   „    1 and 9  .        1.70942                  1.15298
Quartiles  . . . .        1.36263                  0.91908


It will be seen that the influence of fluctuations of sampling on
the several percentiles increases as we depart from the median:
the standard error of the quartiles is nearly one-tenth greater than
that of the median, and the standard error of the first or ninth
deciles more than one-third greater.
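The figures of the two tables may be recomputed without reference to printed tables. The sketch below uses `NormalDist` from Python's standard `statistics` module for the normal curve of unit area and unit standard-deviation; the function names are illustrative, and the probable error is taken, as an assumption consistent with the second column above, to be the standard error multiplied by the upper quartile deviate of the normal curve (about 0.67449).

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # unit area, unit standard-deviation
PROBABLE_ERROR_FACTOR = STANDARD_NORMAL.inv_cdf(0.75)  # about 0.67449


def ordinate_at_percentile(p):
    """y_p: ordinate of the unit normal curve at the point above which a
    proportion p of the observations lies."""
    return STANDARD_NORMAL.pdf(STANDARD_NORMAL.inv_cdf(1 - p))


def standard_error_factor(p):
    """Multiplier of sigma/sqrt(n) in equation (1): sqrt(p*q) / y_p."""
    return math.sqrt(p * (1 - p)) / ordinate_at_percentile(p)


rows = [("Median", 0.5), ("Deciles 4 and 6", 0.4), ("Deciles 3 and 7", 0.3),
        ("Deciles 2 and 8", 0.2), ("Deciles 1 and 9", 0.1),
        ("Quartiles", 0.25)]
for name, p in rows:
    factor = standard_error_factor(p)
    print(f"{name:16s} y_p = {ordinate_at_percentile(p):.7f}  "
          f"s.e. factor = {factor:.5f}  "
          f"p.e. factor = {PROBABLE_ERROR_FACTOR * factor:.5f}")
```

Since the normal curve is symmetrical, either member of a pair of percentiles (e.g. p = 0.1 or p = 0.9) gives the same factor.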

5. Consider further the influence of the form of the
frequency-distribution on the standard error of the median, as this is an
important form of average. For a distribution with a given
number of observations and a given standard-deviation the
standard error varies inversely as $y_p$. Hence for a distribution in
which $y_p$ is small, for example a U-shaped distribution like that
of fig. 18 or fig. 19, the standard error of the median will be
relatively high, and it will, in so far, be an undesirable form of
average to employ. On the other hand, in the case of a
distribution which has a high peak in the centre, so as to exhibit a value
of $y_p$ large compared with the standard-deviation, the standard
error of the median will be relatively low. We can create such a
“peaked” distribution by superposing a normal curve with a
small standard-deviation on a normal curve with the same mean
and a relatively large standard-deviation. To give some idea of
the reduction in the standard error of the median that may be
effected by a moderate change in the form of the distribution, let
us find for what ratio of the standard-deviations of two such curves,
having the same area, the standard error of the median reduces to
$\sigma/\sqrt{n}$, where $\sigma$ is of course the standard-deviation of the
compound distribution.

Let $\sigma_1$, $\sigma_2$ be the standard-deviations of the two distributions,
and let there be n/2 observations in each. Then

\[ \sigma^2 = \tfrac{1}{2}\left(\sigma_1^2 + \sigma_2^2\right) \tag{a} \]

On the other hand, the value of $y_p$ is

\[ y_p = \frac{\sigma}{2\sqrt{2\pi}}\left(\frac{1}{\sigma_1} + \frac{1}{\sigma_2}\right) \tag{b} \]

Hence the standard error of the median is

\[ \frac{\sqrt{2\pi}\,\sigma_1\sigma_2}{\sigma_1 + \sigma_2}\cdot\frac{1}{\sqrt{n}} \tag{c} \]

(c) is equal to $\sigma/\sqrt{n}$ if

\[ \frac{(\sigma_1 + \sigma_2)\sqrt{\sigma_1^2 + \sigma_2^2}}{2\sqrt{\pi}\,\sigma_1\sigma_2} = 1, \]

that is, writing $\sigma_2/\sigma_1 = \rho$, if

\[ \frac{(1 + \rho)\sqrt{1 + \rho^2}}{2\sqrt{\pi}\,\rho} = 1, \]

or

\[ \rho^4 + 2\rho^3 + (2 - 4\pi)\rho^2 + 2\rho + 1 = 0 . \]

This equation may be reduced to a quadratic and solved by
taking $\rho + 1/\rho$ as a new variable. The roots found give
$\rho$ = 2.2360 . . . or 0.4472 . . ., the one root being merely the
reciprocal of the other. The standard error of the median will
therefore be $\sigma/\sqrt{n}$, in such a compound distribution, if the
standard-deviation of the one normal curve is, in round numbers, about
2¼ times that of the other. If the ratio be greater, the standard error
of the median will be less than $\sigma/\sqrt{n}$. The distribution
for which the standard error of the median is exactly equal to
$\sigma/\sqrt{n}$ is shown in fig. 53: it will be seen that it is by no means
a very striking form of distribution; at a hasty glance it might
almost be taken as normal. In the case of distributions of a form
more or less similar to that shown, it is evident that we cannot
at all safely estimate by eye alone the relative standard error of
the median as compared with $\sigma/\sqrt{n}$.
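The solution of the quartic may be checked numerically. The sketch below follows the substitution named in the text: with $u = \rho + 1/\rho$ the quartic becomes $u^2 + 2u - 4\pi = 0$. The function names are illustrative only, and the smaller standard-deviation is taken as unity, which is no restriction since only the ratio matters.

```python
import math


def peaked_ratio():
    """Solve rho^4 + 2 rho^3 + (2 - 4 pi) rho^2 + 2 rho + 1 = 0 for the
    root greater than 1, by the substitution u = rho + 1/rho, which
    reduces the quartic to u^2 + 2u - 4 pi = 0."""
    u = -1 + math.sqrt(1 + 4 * math.pi)
    return (u + math.sqrt(u * u - 4)) / 2


def median_se_over_mean_se(rho):
    """Ratio of the standard error of the median to sigma/sqrt(n) for an
    equal mixture of two normal curves with s.d. 1 and rho and a common
    mean."""
    sigma = math.sqrt((1 + rho ** 2) / 2)  # s.d. of the compound curve
    # Central ordinate of the compound curve when drawn with unit area.
    central_ordinate = (1 + 1 / rho) / (2 * math.sqrt(2 * math.pi))
    return 1 / (2 * central_ordinate * sigma)


rho = peaked_ratio()
print("rho          =", round(rho, 4))
print("1 / rho      =", round(1 / rho, 4))
print("ratio at rho =", round(median_se_over_mean_se(rho), 6))
print("ratio at 1   =", round(median_se_over_mean_se(1.0), 5))
print("ratio at 3   =", round(median_se_over_mean_se(3.0), 5))
```

With $\rho = 1$ the compound curve is a single normal curve and the ratio returns to the value for the median in § 4; with a ratio greater than the root the median has the smaller standard error.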

Fig. 53.

6. In the case of a grouped frequency-distribution, if the
number of observations is sufficient to make the class-frequencies
run fairly smoothly, i.e. to enable us to regard the distribution
as nearly that of a very large sample, the standard error of any
percentile can be calculated very readily indeed, for we can
eliminate $\sigma$ from equation (1). Let $f_p$ be the
frequency-per-class-interval at the given percentile: simple interpolation will
give us the value with quite sufficient accuracy for practical
purposes, and if the figures run irregularly they may be smoothed.
Let $\sigma$ be the value of the standard-deviation expressed in
class-intervals, and let n be the number of observations as before.
Then since $y_p$ is the ordinate of the frequency-distribution when
drawn with unit standard-deviation and unit area, we must
have

\[ y_p = \frac{\sigma}{n}\,f_p . \]

But this gives at once for the standard error expressed in terms
of the class-interval as unit

\[ \sigma_{\epsilon} = \frac{\sqrt{npq}}{f_p} \tag{2} \]

As an example in which we can compare the results given by
the two different formulae (1) and (2), take the distribution of
stature used as an illustration in Chaps. VII. and VIII. and in
§§ 13, 14 of Chap. XV. The number of observations is 8585,
and the standard-deviation 2.57 in., the distribution being
approximately normal: $\sigma/\sqrt{n}$ = 0.027737, and, multiplying by the
factor 1.253 . . . given in the table in § 4, this gives 0.0348
as the standard error of the median, on the assumption of
normality of the distribution. Using the direct method of
equation (2), we find the median to be 67.47 (Chap. VII. § 15),
which is very nearly at the centre of the interval with a
frequency 1329. Taking this as being, with sufficient accuracy
for our present purpose, the frequency per interval at the median,
the standard error is

\[ \frac{\sqrt{8585}}{2 \times 1329} = 0.0349 . \]

As we should expect, the value is practically the same as that
obtained from the value of the standard-deviation on the
assumption of normality.

Let us find the standard error of the first and ninth deciles
as another illustration. On the assumption that the distribution
is normal, these standard errors are the same, and equal to
0.027737 × 1.70942 = 0.0474. Using the direct method, we
find by simple interpolation the approximate frequencies per
interval at the first and ninth deciles respectively to be 590 and
570, giving standard errors of 0.0471 and 0.0486, mean 0.0479,
slightly in excess of that found on the assumption that the
frequency is given by the normal curve. The student should notice
that the class-interval is, in this case, identical with the unit of
measurement, and consequently the answer given by equation (2)
does not require to be multiplied by the magnitude of the
interval.

In the case of the distribution of pauperism (Chap. VII.,
Example i.), the fact that the class-interval is not a unit must
be remembered. The frequency at the median (3.195 per cent.)
is approximately 96, and this gives for the standard error of the
median by (2) (the number of observations being 632) 0.1309
intervals, that is 0.0655 per cent.
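The arithmetic of equation (2) for the examples above may be sketched as follows. The function name and its `interval` argument are illustrative; the width of 0.5 per cent. for the pauperism class-interval is an inference from the statement that 0.1309 intervals is 0.0655 per cent.

```python
import math


def percentile_se_grouped(n, f_p, p=0.5, interval=1.0):
    """Equation (2): sqrt(n*p*q) / f_p class-intervals, converted to the
    unit of measurement by multiplying by the width of the class-interval."""
    return math.sqrt(n * p * (1 - p)) / f_p * interval


# Stature: class-interval identical with the unit of measurement (1 in.).
stature_median = percentile_se_grouped(n=8585, f_p=1329)
stature_first_decile = percentile_se_grouped(n=8585, f_p=590, p=0.1)
# Pauperism: class-interval assumed to be 0.5 per cent.
pauperism_median = percentile_se_grouped(n=632, f_p=96, interval=0.5)

print("stature, median       :", round(stature_median, 4))
print("stature, first decile :", round(stature_first_decile, 4))
print("pauperism, median     :", round(pauperism_median, 4))
```

Only the number of observations and the frequency per class-interval at the percentile are required; the standard-deviation does not enter.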

7. In finding the standard error of the difference between two
percentiles in the same distribution, the student must be
careful to note that the errors in two such percentiles are not
independent. Consider the two percentiles, for which the values
of p and q are $p_1$, $q_1$ and $p_2$, $q_2$ respectively, the first-named
being the lower of the two percentiles. These two percentiles divide the
whole area of the frequency curve into three parts, the areas of
which are proportional to $q_1$, $1 - q_1 - p_2$, and $p_2$. Further, since
the errors in the first percentile are directly proportional to the
errors in $q_1$, and the errors in the second percentile are directly
proportional but of opposite sign to the errors in $p_2$, the
correlation between errors in the two percentiles will be the same as
the correlation between errors in $q_1$ and $p_2$ but of opposite sign.
But if there be a deficiency of observations below the lower
percentile, producing an error $\delta_1$ in $q_1$, the missing observations
will tend to be spread over the two other sections of the curve
in proportion to their respective areas, and will therefore tend to
produce an error

\[ \delta_2 = -\frac{p_2}{p_1}\,\delta_1 \]

in $p_2$. If then r be the correlation between errors in $q_1$ and $p_2$,
$\epsilon_1$ and $\epsilon_2$ their respective standard errors, we have

\[ r\,\frac{\epsilon_2}{\epsilon_1} = -\frac{p_2}{p_1} . \]

Or, inserting the values of the standard errors,

\[ r = -\sqrt{\frac{q_1 p_2}{p_1 q_2}} . \]

The correlation between the percentiles is the same in magnitude
but opposite in sign: it is obviously positive, and consequently

\[ \text{correlation between errors in two percentiles}
   = +\sqrt{\frac{q_1 p_2}{p_1 q_2}} \tag{3} \]

If the two percentiles approach very close together, $q_1$ and $q_2$,
$p_1$ and $p_2$ become sensibly equal to one another, and the
correlation becomes unity, as we should expect.

8. Let us apply the above value of the correlation between
percentiles to find the standard error of the semi-interquartile
range for the normal curve. Inserting $q_1 = p_2 = \tfrac{1}{4}$,
$q_2 = p_1 = \tfrac{3}{4}$, we find $r = \tfrac{1}{3}$. Hence the standard
error of the interquartile range is, applying the ordinary formula for
the standard-deviation of a difference, $2/\sqrt{3}$ times the standard
error of either quartile, or the standard error of the
semi-interquartile range $1/\sqrt{3}$ times the standard error of a quartile.
Taking the value of the standard error of a quartile from the table in
§ 4, we have, finally,

\[ \text{standard error of the semi-interquartile range in a normal distribution}
   = 0.78672\,\frac{\sigma}{\sqrt{n}} \tag{4} \]

Of course the standard-deviation of the inter-quartile, or
semi-interquartile, range can readily be worked out in any particular
case, using equation (2) and the value of the correlation
given above: it is best to work out such standard errors
from first principles, applying the usual formula for the standard
deviation of the difference of two correlated variables (Chap. XI.
§ 2, equation (1)).
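The working of §§ 7 and 8 may be traced numerically. The sketch below is an illustration only: the function names are invented, the correlation follows equation (3), and the normal ordinate at the quartile is obtained from `NormalDist` in Python's standard `statistics` module rather than from printed tables.

```python
import math
from statistics import NormalDist


def percentile_error_correlation(q1, p2):
    """Equation (3): q1 is the proportion below the lower percentile and
    p2 the proportion above the upper percentile."""
    p1, q2 = 1 - q1, 1 - p2
    return math.sqrt(q1 * p2 / (p1 * q2))


def se_of_difference(se_a, se_b, r):
    """Standard-deviation of the difference of two correlated variables."""
    return math.sqrt(se_a ** 2 + se_b ** 2 - 2 * r * se_a * se_b)


# Correlation between errors in the lower and upper quartiles.
r = percentile_error_correlation(0.25, 0.25)

# Standard error of a quartile of the normal curve, in units of
# sigma/sqrt(n), from equation (1).
unit_normal = NormalDist()
y_quartile = unit_normal.pdf(unit_normal.inv_cdf(0.75))
quartile_factor = math.sqrt(0.25 * 0.75) / y_quartile

iqr_factor = se_of_difference(quartile_factor, quartile_factor, r)
semi_iqr_factor = iqr_factor / 2

print("r between quartiles        :", round(r, 4))
print("interquartile range factor :", round(iqr_factor, 5))
print("semi-interquartile factor  :", round(semi_iqr_factor, 5))
```

For a grouped distribution the two quartile standard errors would instead be taken from equation (2) and passed to `se_of_difference` with the same value of r.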

9. If there is any failure of the conditions of simple sampling,
the formulae of the preceding sections cease, of course, to hold
good. We need not, however, enter again into a discussion of
the effect of removing the several restrictions, for the effect on
the standard error of p was considered in detail in §§ 9-14 of
Chap. XIV., and the standard error of any percentile is directly
proportional to the standard error of p (cf. § 3). Further, the
student may be reminded that the standard error of any
percentile measures solely the fluctuations that may be expected in
that percentile owing to the errors of simple sampling alone: it
has no bearing, therefore, save on the one question, whether an
observed divergence of the percentile, from a certain value that
might be expected to be yielded by a more extended series of
observations or that had actually been observed in some other
series, might or might not be due to fluctuations of simple
sampling alone. It cannot and does not give any indication of
the possibility of the sample being biassed or unrepresentative of
the material from which it has been drawn, nor can it give any
indication of the magnitude or influence of definite errors of
observation, errors which may conceivably be of greater
importance than errors of sampling. In the case of the distribution
of statures, for instance, the standard error almost certainly gives
quite a misleading idea as to the accuracy attained in determining
the average stature for the United Kingdom: the sample is not
representative, the several parts of the kingdom not contributing
in their true proportions. The student should refer again to the
discussion of these points in §§ 4-8 of Chap. XIV. Finally, we
may note that the standard error of a percentile cannot be
evaluated unless the number of observations is fairly large: large
enough to determine $f_p$ (eqn. 2) with reasonable accuracy, or
to test whether we may treat the distribution as approximately
normal (cf. also § 16 below).

(As regards the theory of sampling for the median and
percentiles generally, cf. ref. 15, Laplace, Supplement II. (standard
error of the median), Edgeworth, refs. 5, 6, 7, and Sheppard, ref.
27: the preceding sections have been based on the work of
Edgeworth and Sheppard.)

10. Standard Error of the Arithmetic Mean. Let us now pass
to a fresh problem, and determine the standard error of the
arithmetic mean.

This is very readily obtained. Suppose we note separately at
each drawing the value recorded on the first, second, third . . .
and nth card of our sample. The standard-deviation of the values
on each separate card will tend in the long run to be the same,
and identical with the standard-deviation $\sigma$ of X in an indefinitely
large sample, drawn under the same conditions. Further, the
value recorded on each card is (as we assume) uncorrelated with
that on every other. The standard-deviation of the sum of the
values recorded on the n cards is therefore $\sqrt{n}\,\sigma$, and the
standard-deviation of the mean of the sample is consequently
1/nth of this, or,

\[ \sigma_m = \frac{\sigma}{\sqrt{n}} \tag{5} \]

This is a most important and frequently cited formula, and the
student should note that it has been obtained without any
reference to the size of the sample or to the form of the
frequency-distribution. It is therefore of perfectly general application, if
$\sigma$ be known. We can verify it against our formula for the
standard-deviation of sampling in the case of attributes. The
standard-deviation of the number of successes in a sample of m
observations is $\sqrt{mpq}$: the standard-deviation of the total
number of successes in n samples of m observations each is
therefore $\sqrt{nmpq}$: dividing by n we have the standard-deviation of
the mean number of successes in the n samples, viz. $\sqrt{mpq}/\sqrt{n}$,
agreeing with equation (5).
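That equation (5) holds whatever the form of the distribution can be checked exactly on a small case by enumerating every possible sample. The sketch below is an illustration only: the record of six cards is hypothetical, the function names are invented, and the drawings are made with replacement so that the conditions of simple sampling are satisfied.

```python
import math
from itertools import product


def population_sd(values):
    """Standard-deviation of the values in the record."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def sd_of_sample_mean_exact(values, n):
    """Enumerate every equally likely sample of n drawings with
    replacement and return the standard-deviation of the sample means."""
    means = [sum(sample) / n for sample in product(values, repeat=n)]
    return population_sd(means)


values = [1, 2, 3, 4, 5, 6]   # hypothetical record of six cards
n = 4
exact = sd_of_sample_mean_exact(values, n)       # 6**4 = 1296 samples
formula = population_sd(values) / math.sqrt(n)   # equation (5)

print("by enumeration  :", round(exact, 6))
print("sigma / sqrt(n) :", round(formula, 6))
```

The record here is rectangular, not normal, and the sample is of four cards only, yet the two figures agree.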

11. For a normal curve the standard error of the mean is to
the standard error of the median approximately as 100 to 125
(cf. the formula for the median given earlier in this chapter),
and in general the standard errors of the two stand in a somewhat
similar ratio for a distribution not differing largely from the
normal form. For the distribution of statures used as an
illustration in § 6 the standard error of the median was found to
be 0.0349: the standard error of the mean is only 0.0277. The
distribution being very approximately normal, the ratio of the two
standard errors, viz. 1.26, assumes almost exactly the theoretical
magnitude. In the case of the asymmetrical distribution of rates
of pauperism, also used as an illustration in § 6, the standard
error of the median was found to be 0.0655 per cent. The
standard error of the mean is only 0.0493 per cent., which bears
to the standard error of the median a ratio of 1 to 1.33. As
such cases as these seem on the whole to be the more common
and typical, we stated in Chap. VII. § 18 that the mean is in
general less affected than the median by errors of sampling. At
the same time we also indicated the exceptional cases in which
the median might be the more stable: cases in which the mean
might, for example, be affected considerably by small groups of
widely outlying observations, or in which the
frequency-distribution assumed a form resembling fig. 53, but even
more exaggerated as regards the height of the central "peak" and
the relative length of the "tails." Such distributions are not
uncommon in some economic statistics, and they might be expected
to characterise some forms of experimental error. If, in these
cases, the greater stability of the median is sufficiently marked
to outweigh its disadvantages in other respects, the median
may be the better form of average to use. Fig. 53 represents
a distribution in which the standard errors of the mean and of the
median are the same. Further, in some experimental cases it is
conceivable that the median may be less affected by definite
experimental errors, the average of which does not tend to be
zero, than is the mean; this is, of course, a point quite distinct
from that of errors of sampling.
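
The ratios quoted in this section can be checked by simple arithmetic. In the sketch below the four standard errors are those given in the text; the theoretical ratio $\sqrt{\pi/2}$ for a normal distribution is the usual large-sample result and is an addition here, the text itself quoting it only as "100 to 125".

```python
import math

# Large-sample ratio (s.e. of median) / (s.e. of mean) for a normal curve.
# Assumption: the standard result sqrt(pi/2), not stated explicitly in the text.
normal_ratio = math.sqrt(math.pi / 2)

# Standard errors quoted in the text for the two illustrations of section 6.
stature_ratio = 0.0349 / 0.0277      # statures: median s.e. / mean s.e.
pauperism_ratio = 0.0655 / 0.0493    # pauperism rates: median s.e. / mean s.e.

print(round(normal_ratio, 4), round(stature_ratio, 2), round(pauperism_ratio, 2))
```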

12. If two quite independent samples of $n_1$ and $n_2$ observations
respectively be drawn from a record, evidently $\epsilon_{12}$, the
standard error of the difference of their means, is given by

$$\epsilon_{12}^2 = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right) \qquad (6)$$

If an observed difference exceed three times the value of
$\epsilon_{12}$ given by this formula it can hardly be ascribed to
fluctuations of sampling. If, in a practical case, the value of
$\sigma$ is not known a priori, we must substitute an observed
value, and it would seem natural to take as this value the
standard-deviation in the two samples thrown together. If,
however, the standard-deviations of the two samples themselves
differ more than can be accounted for on the basis of fluctuations
of sampling alone (see below, § 15), we evidently cannot assume
that both samples have been drawn from the same record: the one
sample must have been drawn from a record or a universe
exhibiting a greater standard-deviation than the other. If two
samples be drawn quite independently from different universes,
indefinitely large samples from which exhibit the
standard-deviations $\sigma_1$ and $\sigma_2$, the standard error of
the difference of their means will be given by

$$\epsilon_{12}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \qquad (7)$$

This is, indeed, the formula usually employed for testing the
significance of the difference between two means in any case:
seeing that the standard error of the mean depends on the
standard-deviation only, and not on the mean, of the distribution,
we can inquire whether the two universes from which samples
have been drawn differ in mean apart from any difference in
dispersion.
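
As a minimal sketch of how equations (6) and (7) and the "three times the standard error" rule might be applied, consider the following. The function names and the numerical values are hypothetical, chosen only for illustration.

```python
import math

def se_difference_of_means(sigma1, n1, sigma2, n2):
    """Equation (7); with sigma1 == sigma2 it reduces to equation (6)."""
    return math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

def exceeds_three_standard_errors(observed_difference, se):
    """The rough rule of the text: a difference beyond 3 x s.e. can hardly
    be ascribed to fluctuations of sampling."""
    return abs(observed_difference) > 3 * se

# Hypothetical samples: s.d. 2.5 on 400 observations, s.d. 2.7 on 900.
se = se_difference_of_means(2.5, 400, 2.7, 900)
print(se, exceeds_three_standard_errors(0.5, se))
```

In practice the two standard-deviations would be the observed values from the samples, with the caution about small samples given at the end of the chapter.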

If two quite independent samples be drawn from the same
universe, but instead of comparing the mean of the one with the
mean of the other we compare the mean of the first with the
mean of both samples together, the use of (6) or (7) is not
justified, for errors in the mean of the one sample are correlated
with errors in the mean of the two together. Following precisely
the lines of the similar problem in § 13, Chap. XIII., case III., we
find that this correlation is $\sqrt{n_1/(n_1+n_2)}$, and hence

$$\epsilon_{01}^2 = \sigma^2\,\frac{n_2}{n_1(n_1+n_2)} \qquad (8)$$

(For a complete treatment of this problem in the case of samples
drawn from two different universes cf. ref. 22.)
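
The step from the correlation $\sqrt{n_1/(n_1+n_2)}$ to the standard error of the difference between the mean of the first sample and the mean of both together can be checked as follows. This is a sketch under the assumption that the general rule for the variance of a difference of two correlated quantities is used; the function names and numbers are hypothetical.

```python
import math

def se_sample_vs_pooled(sigma, n1, n2):
    """Closed form: sigma^2 * n2 / (n1 * (n1 + n2)), then the square root."""
    return sigma * math.sqrt(n2 / (n1 * (n1 + n2)))

def se_via_correlation(sigma, n1, n2):
    """Same quantity from var(a - b) = var(a) + var(b) - 2 r sd(a) sd(b)."""
    sd_first = sigma / math.sqrt(n1)          # s.e. of mean of first sample
    sd_pooled = sigma / math.sqrt(n1 + n2)    # s.e. of mean of both together
    r = math.sqrt(n1 / (n1 + n2))             # correlation stated in the text
    return math.sqrt(sd_first ** 2 + sd_pooled ** 2 - 2 * r * sd_first * sd_pooled)

# Hypothetical values for illustration only.
closed_form = se_sample_vs_pooled(3.0, 50, 150)
from_correlation = se_via_correlation(3.0, 50, 150)
print(closed_form, from_correlation)
```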

13. The distribution of means of samples drawn under the
conditions of simple sampling will always be more symmetrical
than the distribution of the original record, and the symmetry
will be the greater the greater the number of observations in the
sample. Further, the distribution of means (and therefore also of
the differences between means) tends to become not merely
symmetrical but normal. We can only illustrate, not prove, the
point here; but if the student will refer to § 13, Chap. XV., he will
see that the genesis of the normal curve in this case is in
accordance with what we then stated, viz. that the distribution
tends to be normal whenever the variable may be regarded as the
sum (or some slightly more complex function) of a number of other
variables. In the present instance this condition is strictly
fulfilled. The mean of the sample of $n$ observations is the sum of
the values in the sample each divided by $n$, and we should expect
the distribution to be the more nearly normal the larger $n$. As
an illustration of the approach to symmetry even for small values
of $n$, we may take the following case. If the student will turn to
the calculated binomials, given as illustrations of the forms of
binomial distributions in Chap. XV. § 3, he will find there the
distribution of the number of successes for twenty events when
$q = 0.9$, $p = 0.1$: the distribution is extremely skew, starting at
zero, rising to high frequencies for 1 and 2 successes, and thence
tailing off to 20 cases of 7 successes in 10,000 throws, 4 cases of 8
successes and 1 case of 9 successes. But now find the distribution
for the mean number of successes in groups of five throws,
under the same conditions. This will be equivalent to finding
the distribution of the number of successes for 100 such events,
and then dividing the observed number of successes by five, the
last process making no difference to the form of the distribution,
but only to its scale. But the distribution of the number of
successes for 100 events when $q = 0.9$, $p = 0.1$, is also given in
Chap. XV. § 3, and it will be seen that, while it is appreciably
asymmetrical, the divergence from symmetry is comparatively
small: the distribution has gained very greatly in symmetry
though only five observations have been taken to the sample.
We may therefore reasonably assume, if our sample is large,
that the distribution of means is approximately a normal
distribution, and we may calculate, on that assumption, the
frequency with which any given deviation in an observed mean,
from a theoretical value or a value observed in some other series,
will arise from fluctuations of simple sampling alone.
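
The binomial illustration can be reproduced directly. The sketch below recomputes the tail frequencies per 10,000 throws for twenty events and compares the asymmetry of the 20-event and 100-event distributions; the moment measure of skewness $(q-p)/\sqrt{npq}$ is an assumption introduced here for the comparison and is not the measure used in this passage.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent events."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_skewness(n, p):
    """Third standardised moment of the binomial: (q - p) / sqrt(n p q)."""
    q = 1 - p
    return (q - p) / math.sqrt(n * p * q)

# Expected frequencies in 10,000 throws of 20 events with p = 0.1.
tail_20 = {k: round(10000 * binom_pmf(k, 20, 0.1)) for k in (7, 8, 9)}

# Asymmetry for single throws (20 events) and means of five throws (100 events).
skew_20 = binom_skewness(20, 0.1)
skew_100 = binom_skewness(100, 0.1)
print(tail_20, round(skew_20, 3), round(skew_100, 3))
```

The tail frequencies agree with those quoted (20, 4 and 1), and the skewness falls by the factor $\sqrt{5}$ on passing from single throws to means of five.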

The warning is necessary, however, that the approach to
normality is only rapid if the condition that the several drawings
for each sample shall be independent is strictly fulfilled. If the
observations are not independent, but are to some extent positively
correlated with each other, even a fairly large sample may
continue to reflect any asymmetry existing in the original
distribution (cf. ref. 32 and the record of sampling there cited).

If the original distribution be normal, the distribution of
means, even of small samples, is strictly normal. This follows at
once from the fact that any linear function of normally distributed
variables is itself normally distributed (Chap. XVI. § 6). The
distribution will not in general, however, be normal if the
deviation of the mean of each sample is expressed in terms of the
standard-deviation of that sample (cf. ref. 30).

14. Let us consider briefly the effect on the standard error of
the mean if the conditions of simple sampling as laid down in
§ 2 cease to apply.

(a) If we do not draw from the same record all the time, but
first draw a series of samples from one record, then another
series from another record with a somewhat different mean and
standard-deviation, and so on, or if we draw the successive
samples from essentially different parts of the same record, the
standard error will be greatly increased. For suppose we draw
$k_1$ samples from the first record, for which the standard-deviation
(in an indefinitely large sample) is $\sigma_1$, and the mean differs
by $d_1$ from the mean of all the records together (as ascertained by
large samples in numbers proportionate to those now taken); $k_2$
samples from the second record, for which the standard-deviation
is $\sigma_2$, and the mean differs by $d_2$ from the mean of all the
records together, and so on. Then for the samples drawn from the
first record the standard error of the mean will be
$\sigma_1/\sqrt{n}$, but the distribution will centre round a value
differing by $d_1$ from the mean for all the records together, and
so on for the samples drawn from the other records. Hence, if
$\sigma_m$ be the standard error of the mean, $N$ the total number of
samples,

$$N\sigma_m^2 = \frac{\Sigma(k\sigma^2)}{n} + \Sigma(kd^2).$$

But the standard-deviation $\sigma_0$ for all the records together is
given by

$$N\sigma_0^2 = \Sigma(k\sigma^2) + \Sigma(kd^2).$$

Hence, writing $\Sigma(kd^2) = N s_m^2$,

$$\sigma_m^2 = \frac{\sigma_0^2}{n} + \frac{n-1}{n}\,s_m^2 \qquad (9)$$

This equation corresponds precisely to equation (2) of § 9, Chap.
XIV. The standard error of the mean, if our samples are drawn
from different records or from essentially different parts of the
entire record, may be increased indefinitely as compared with the
value it would have in the case of simple sampling. If, for
example, we take the statures of samples of $n$ men in a number
of different districts of England, and the standard-deviation of all
the statures observed is $\sigma_0$, the standard-deviation of the means
for the different districts will not be $\sigma_0/\sqrt{n}$, but will have
some greater value, dependent on the real variation in mean stature
from district to district.
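
The algebra of case (a) can be checked numerically. The sketch below computes the variance of the sample means directly from the component records and again from the combined form $\sigma_0^2/n + (n-1)s_m^2/n$; all names and figures are hypothetical, and the deviations $d$ are chosen so that their weighted sum is zero, as the definition requires.

```python
import math

def variance_of_means_mixed_records(k, sigma, d, n):
    """Return (direct, combined, simple) variances of the mean of n values.

    k[i]     : number of samples drawn from record i
    sigma[i] : standard-deviation within record i
    d[i]     : deviation of the mean of record i from the general mean
    """
    N = sum(k)
    sum_k_sigma2 = sum(ki * si ** 2 for ki, si in zip(k, sigma))
    sum_k_d2 = sum(ki * di ** 2 for ki, di in zip(k, d))
    direct = sum_k_sigma2 / (n * N) + sum_k_d2 / N
    sigma0_sq = (sum_k_sigma2 + sum_k_d2) / N      # all records together
    s_m_sq = sum_k_d2 / N                          # variance of record means
    combined = sigma0_sq / n + (n - 1) / n * s_m_sq
    simple = sigma0_sq / n                         # value under simple sampling
    return direct, combined, simple

# Hypothetical case: two records, 2 and 1 samples, samples of n = 10.
direct, combined, simple = variance_of_means_mixed_records(
    k=[2, 1], sigma=[3.0, 4.0], d=[1.0, -2.0], n=10)
print(direct, combined, simple)
```

The first two values agree, and both exceed the third, the value that simple sampling would give.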

(b) If we are drawing from the same record throughout, but
always draw the first card from one part of that record, the
second card from another part, and so on, and these parts differ
more or less, the standard error of the mean will be decreased.
For if, in large samples drawn from the subsidiary parts of the
record from which the several cards are taken, the
standard-deviations are $\sigma_1, \sigma_2, \ldots \sigma_n$, and the means
differ by $d_1, d_2, \ldots d_n$ from the mean for a large sample from
the entire record, we have

$$\sigma_0^2 = \frac{1}{n}\Sigma(\sigma^2) + \frac{1}{n}\Sigma(d^2).$$

Hence, writing $s_m^2$ for $\Sigma(d^2)/n$,

$$\sigma_m^2 = \frac{1}{n^2}\Sigma(\sigma^2) = \frac{\sigma_0^2}{n} - \frac{s_m^2}{n} \qquad (10)$$

The last equation again corresponds precisely with that given for
the same departure from the rules of simple sampling in the case
of attributes (Chap. XIV. § 11, eqn. 4). If, to vary our previous
illustration, we had measured the statures of men in each of $n$
different districts, and then proceeded to form a set of samples
by taking one man from each district for the first sample, one
man from each district for the second sample, and so on, the
standard-deviation of the means of the samples so formed would
be appreciably less than the standard error of simple sampling
$\sigma_0/\sqrt{n}$. As a limiting case, it is evident that if the men in
each district were all of precisely the same stature, the means of
all the samples so compounded would be identical: in such a case,
in fact, $\sigma_0 = s_m$ and consequently $\sigma_m = 0$. To give another
illustration, if the cards from which we were drawing samples had
been arranged in order of the magnitude of $x$ recorded on each, we
would get a much more stable sample by drawing one card from each
successive $n$th part of the record than by taking the sample
according to our previous rules, e.g. shaking them up in a bag
and taking out cards blindfold, or using some equivalent process.

The result is perhaps of some practical interest. It shows that,
if we are actually taking samples from a large area, different
districts of which exhibit markedly different means for the
variable under consideration, and are limited to a sample of $n$
observations, if we break up the whole area into $n$ sub-districts,
each as homogeneous as possible, and take a contribution to the
sample from each, we will obtain a more stable mean by this
orderly procedure than will be given, for the same number of
observations, by any process of selecting the districts from which
samples shall be taken by chance. There may, however, be a
greater risk of biassed error. The conclusions seem in accord
with common-sense.
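
A similar check may be made for case (b), where one observation is drawn from each of $n$ parts of the record. The sketch assumes, as in the text, that the parts contribute equally to the whole record; the names and numbers are hypothetical.

```python
import math

def variance_of_means_one_from_each_part(sigma, d):
    """Return (direct, combined, simple) variances of the sample mean.

    sigma[i] : standard-deviation within part i of the record
    d[i]     : deviation of the mean of part i from the general mean
    One observation is taken from each part, so n = len(sigma).
    """
    n = len(sigma)
    sum_sigma2 = sum(s ** 2 for s in sigma)
    sum_d2 = sum(x ** 2 for x in d)
    direct = sum_sigma2 / n ** 2                 # sum of n independent variances / n^2
    sigma0_sq = sum_sigma2 / n + sum_d2 / n      # whole record
    s_m_sq = sum_d2 / n                          # variance of the part means
    combined = sigma0_sq / n - s_m_sq / n
    simple = sigma0_sq / n                       # value under simple sampling
    return direct, combined, simple

# Hypothetical record split into three parts with different means.
direct, combined, simple = variance_of_means_one_from_each_part(
    sigma=[1.0, 2.0, 2.0], d=[-3.0, 0.0, 3.0])
# Limiting case: no variation at all inside any part.
limit_direct, limit_combined, _ = variance_of_means_one_from_each_part(
    sigma=[0.0, 0.0, 0.0], d=[-3.0, 0.0, 3.0])
print(direct, combined, simple, limit_combined)
```

Here the orderly procedure gives a variance of 1 against 3 for simple sampling, and the limiting case gives zero, as the text states.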

(c) Finally, suppose that, while our conditions (a) and (b) of § 2
hold good, the magnitude of the variable recorded on one card
drawn is no longer independent of the magnitude recorded on
another card, e.g. that if the first card drawn at any sampling
bears a high value, the next and following cards of the same
sample are likely to bear high values also. Under these
circumstances, if $r_{12}$ denote the correlation between the values
on the first and second cards, and so on,

$$\sigma_m^2 = \frac{\sigma^2}{n} + \frac{2\sigma^2}{n^2}\left(r_{12} + r_{13} + \ldots + r_{23} + \ldots\right).$$

There are $n(n-1)/2$ correlations; and if, therefore, $r$ is the
arithmetic mean of them all, we may write

$$\sigma_m^2 = \frac{\sigma^2}{n}\left[1 + r(n-1)\right] \qquad (11)$$

As the means and standard-deviations of $x_1, x_2, \ldots x_n$ are all
identical, $r$ may more simply be regarded as the correlation
coefficient for a table formed by taking all possible pairs of the
$n$ values in every sample. If this correlation be positive, the
standard error of the mean will be increased, and for a given
value of $r$ the increase will be the greater, the greater the size of
the samples. If $r$ be negative, on the other hand, the standard
error will be diminished. Equation (11) corresponds precisely to
equation (6), § 13, of Chap. XIV.
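
The effect of a mean correlation $r$ between the members of a sample may be sketched as follows; the function name and the trial values are assumptions made for illustration.

```python
import math

def se_mean_correlated(sigma, n, r):
    """Equation (11): s.e. of the mean when the n values in a sample have
    average mutual correlation r (r cannot be less than -1/(n - 1))."""
    return math.sqrt(sigma ** 2 / n * (1 + r * (n - 1)))

sigma = 2.0
for n in (10, 100):
    simple = se_mean_correlated(sigma, n, 0.0)     # simple sampling
    positive = se_mean_correlated(sigma, n, 0.1)   # positively correlated values
    print(n, simple, positive, positive / simple)
```

For the same $r = 0.1$ the standard error is increased in the ratio $\sqrt{1.9}$ when $n = 10$ but $\sqrt{10.9}$ when $n = 100$, which is the sense of the remark that the increase is greater the larger the sample.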

As was pointed out in that chapter, the case when $r$ is positive
covers the case discussed under (a): for if we draw successive
samples from different records, such a positive correlation is at
once introduced, although the drawings of the several cards at
each sampling are quite independent of one another. Similarly,
the case discussed under (b) is covered by the case of negative
correlation, for if each card is always drawn from a separate and
distinct part of the record, the correlation between any two $x$'s
will on the average be negative: if some one card be always drawn
from a part of the record containing low values of the variable,
the others must on an average be drawn from parts containing
relatively high values. It is as well, however, to keep the cases
(a), (b), and (c) distinct, since a positive or negative correlation
may arise for reasons quite different from those considered under
(a) and (b).

15. With this discussion of the standard error of the arithmetic
mean we must bring the present work to a close. To indicate
briefly our reasons for not proceeding further with the discussion
of standard errors, we must remind the student that in order to
express the standard error of the mean we require to know, in
addition to the mean itself, the standard-deviation about the mean,
or, in other words, the mean (deviation)$^2$ with respect to the mean.
Similarly, to express the standard error of the standard-deviation
we require to know, in the general case, the mean (deviation)$^4$
with respect to the mean. Either, then, we must find this quantity
for the given distribution, and this would entail entering on a
field of work which hitherto we have intentionally avoided, or we
must, if that be possible, assume the distribution to be of such a
form that we can express the mean (deviation)$^4$ in terms of the
mean (deviation)$^2$. This can be done, as a fact, for the normal
distribution, but the proof would again take us rather beyond
the limits that we have set ourselves. To deal with the standard
error of the correlation coefficient would take us still further
afield, and the proof would be laborious and difficult, if not
impossible, without the use of the differential and integral
calculus. We must content ourselves, therefore, with a simple
statement of the standard errors of some of the more important
constants.

Standard-deviation. If the distribution be normal,

$$\text{standard error of the standard-deviation in a normal distribution} = \frac{\sigma}{\sqrt{2n}} \qquad (12)$$

This is generally given as the standard error in all cases: it is,
however, by no means exact: the general expression is

$$\text{standard error of the standard-deviation in a distribution of any form} = \sqrt{\frac{\mu_4 - \mu_2^2}{4\mu_2 n}} \qquad (13)$$

where $\mu_4$ is the mean (deviation)$^4$, deviations being, of course,
measured from the mean, and $\mu_2$ the mean (deviation)$^2$ or the
square of the standard-deviation. $n$ is assumed sufficiently large
to make the errors in the standard-deviation small compared with
that quantity itself. Equation (13) may in some cases give
values considerably greater, twice as great or more, than (12).
(Cf. ref. 17.) If, however, the distribution be normal, equation
(12) gives the standard error not merely of standard-deviations of
order zero, to use the terminology of Chap. XII., but of
standard-deviations of any order (ref. 33). It will be noticed, on
reference to equation (4) above, § 8, that the standard error of the
standard-deviation is less than that of the semi-interquartile range
for a normal distribution.
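
The relation between the normal and the general expression for the standard error of the standard-deviation may be sketched as below. The check uses the property that $\mu_4 = 3\mu_2^2$ for a normal distribution, a standard fact assumed here rather than proved in the text; the function names and numbers are hypothetical.

```python
import math

def se_sd_normal(sigma, n):
    """Normal case: sigma / sqrt(2 n)."""
    return sigma / math.sqrt(2 * n)

def se_sd_general(mu2, mu4, n):
    """General case: sqrt((mu4 - mu2^2) / (4 mu2 n)), for large n."""
    return math.sqrt((mu4 - mu2 ** 2) / (4 * mu2 * n))

sigma, n = 2.0, 500
mu2 = sigma ** 2

# Normal distribution: mu4 = 3 mu2^2, and the two expressions coincide.
normal_value = se_sd_normal(sigma, n)
general_for_normal = se_sd_general(mu2, 3 * mu2 ** 2, n)

# A long-tailed form with mu4 = 9 mu2^2 gives twice the normal value.
general_long_tailed = se_sd_general(mu2, 9 * mu2 ** 2, n)
print(normal_value, general_for_normal, general_long_tailed)
```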

For a normal distribution, again, we have

$$\text{standard error of the coefficient of variation} = \frac{v}{\sqrt{2n}}\left\{1 + 2\left(\frac{v}{100}\right)^2\right\}^{1/2} \qquad (14)$$

The expression in the bracket is usually very nearly unity, for
a normal distribution, and in that case may be neglected.
Correlation coefficient. If the distribution be normal,

$$\text{standard error of the correlation coefficient for a normal distribution} = \frac{1 - r^2}{\sqrt{n}} \qquad (15)$$

This is the value always given: the use of a more general formula
which would entail the use of higher moments does not appear
to have been attempted. As regards the case of small samples,
cf. refs. 10, 28, and 31. Equation (15) gives the standard error
of a coefficient of any order, total or partial (ref. 33). For the
standard error of the correlation-coefficient for a fourfold table
(Chap. XI., § 10), see ref. 34: the formula (15) does not apply.
Coefficient of regression. If the distribution be normal,

$$\text{standard error of the coefficient of regression } b_{12} \text{ for a normal distribution} = \frac{\sigma_1\sqrt{1 - r_{12}^2}}{\sigma_2\sqrt{n}} = \frac{\sigma_{1.2}}{\sigma_2\sqrt{n}} \qquad (16)$$

This formula again applies to a regression coefficient of any order,
total or partial: i.e. in terms of our general notation, $k$ denoting
any collection of secondary subscripts other than 1 or 2,

$$\text{standard error of } b_{12.k} \text{ for a normal distribution} = \frac{\sigma_{1.2k}}{\sigma_{2.k}\sqrt{n}}$$

Correlation ratio. The general expression for the standard
error of the correlation-ratio is a somewhat complex expression
(cf. Professor Pearson's original memoir on the correlation-ratio,
ref. 18, Chap. X.). In general, however, it may be taken as
given sufficiently closely by the above expression for the standard
error of the correlation coefficient, that is to say,

$$\text{standard error of correlation-ratio approximately} = \frac{1 - \eta^2}{\sqrt{n}} \qquad (17)$$

As was pointed out in Chap. X., § 21, the value of
$\zeta = \eta^2 - r^2$ is a test for linearity of regression. Very
approximately (Blakeman, ref. 1),

$$\text{standard error of } \zeta = 2\sqrt{\frac{\zeta}{n}}\,\sqrt{(1 - \eta^2)^2 - (1 - r^2)^2 + 1} \qquad (18)$$

For rough work the value of the second square root may be
taken as nearly unity, and we have then the simple expression,

$$\text{standard error of } \zeta \text{ roughly} = 2\sqrt{\frac{\zeta}{n}} \qquad (19)$$



To convert any standard error to the probable error multiply by
the constant 0.674489 . . . .
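
A minimal sketch of how some of these formulae and the conversion to the probable error might be put into use follows. The function names and trial values are hypothetical; the constant is obtained from the normal curve as the deviation (in units of the standard-deviation) within which one half of the area lies, which is the usual definition of the probable error and is assumed here.

```python
import math
from statistics import NormalDist

# Upper quartile of the unit normal curve: the probable-error constant.
PROBABLE_ERROR_FACTOR = NormalDist().inv_cdf(0.75)

def se_correlation(r, n):
    """Normal case: (1 - r^2) / sqrt(n)."""
    return (1 - r * r) / math.sqrt(n)

def se_coefficient_of_variation(v, n):
    """Normal case, v in per cent: v / sqrt(2n) * sqrt(1 + 2 (v/100)^2)."""
    return v / math.sqrt(2 * n) * math.sqrt(1 + 2 * (v / 100) ** 2)

def probable_error(standard_error):
    """Convert a standard error to the corresponding probable error."""
    return PROBABLE_ERROR_FACTOR * standard_error

# Hypothetical case: r = 0.5 from 400 observations.
se_r = se_correlation(0.5, 400)
pe_r = probable_error(se_r)
print(PROBABLE_ERROR_FACTOR, se_r, pe_r)
```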

16. We need hardly restate once more the warnings given in
Chap. XIV., and repeated in § 9 above, that a standard error can
give no evidence as to the biassed or representative character of
a sample, nor as to the magnitude of errors of observation, but
we may, in conclusion, again emphasise the warnings given
in §§ 1-3, Chap. XIV., as to the use of standard errors when
the number of observations in the sample is small.

In the first place, if the sample be small, we cannot in general
assume that the distribution of errors is approximately normal:
it would only be normal in the case of the median (for which
$p$ and $q$ are equal) and in the case of the mean of a normal
distribution. Consequently, if $n$ be small, the rule that a
range of three times the standard error includes the majority
of the fluctuations of simple sampling of either sign does not
strictly apply, and the "probable error" becomes of doubtful
significance.

Secondly, it will be noted that the values of $\sigma$ and $y_p$ in (1),
of $f_p$ in (2), and of $\sigma$ in (4) and (5), i.e. the values that would
be given for these constants by an indefinitely large sample drawn
under the same conditions, or the values that they possess in
the original record if the sample is unbiassed, are assumed to be
known a priori. But this is only the case in dealing with the
problems of artificial chance: in practical cases we have to use
the values given us by the sample itself. If this sample is based
on a considerable number of observations, the procedure is safe
enough, but if it be only a small sample we may possibly
misestimate the standard error to a serious extent. Following the
procedure suggested in Chap. XIV., some rough idea as to the
possible extent of under-estimation or over-estimation may be
obtained, e.g. in the case of the mean, by first working out the
standard error of $\sigma$ on the assumption that the values for the
necessary moments are correct, and then replacing $\sigma$ in the
expression for the standard error of the mean by $\sigma$ ± three times
its standard error so obtained.
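
The rough procedure just described, for the case of the mean, may be sketched as follows. The sketch assumes a normal distribution, so that the standard error of the standard-deviation is taken as $\sigma/\sqrt{2n}$; the function name and the trial values are hypothetical.

```python
import math

def se_mean_with_rough_limits(sample_sd, n):
    """Return (lower, nominal, upper) values for the s.e. of the mean.

    The observed standard-deviation is replaced by itself plus or minus
    three times its own standard error (normal form assumed)."""
    se_of_sd = sample_sd / math.sqrt(2 * n)
    nominal = sample_sd / math.sqrt(n)
    lower = (sample_sd - 3 * se_of_sd) / math.sqrt(n)
    upper = (sample_sd + 3 * se_of_sd) / math.sqrt(n)
    return lower, nominal, upper

# Hypothetical small sample: observed s.d. 10 on 50 observations.
lower, nominal, upper = se_mean_with_rough_limits(10.0, 50)
# The same observed s.d. on 5000 observations.
lower_big, nominal_big, upper_big = se_mean_with_rough_limits(10.0, 5000)
print(lower, nominal, upper)
print(lower_big, nominal_big, upper_big)
```

With 50 observations the limits lie 30 per cent. on either side of the nominal value; with 5000 they lie only 3 per cent. away, which illustrates why the procedure is "safe enough" for large samples.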

Finally, it will be remembered that unless the number of
observations is large, we cannot interpret the standard error of
any constant in the inverse sense, i.e. the standard error ceases
to measure with reasonable accuracy the standard-deviation of
true values of the constant round the observed value (Chap.
XIV. § 3). If the sample be large, the direct and inverse
standard errors are approximately the same.




REFERENCES. 

The probable errors of various special coefficients, etc., are generally dealt
with in the memoirs concerning them, reference to which has been made in
the lists of previous chapters: reference has also been made before to most of
the memoirs concerning errors of sampling in proportions or percentages.
The following is a classification of some of the memoirs in the list below:

General: 18, 20.
Theory of fit of two distributions: 9, 19, 23.
Averages and percentiles: 5, 6, 7, 30, 32, 35, 36.
Standard deviation: 17, 26.
Coefficient of correlation (product sum and partial correlations): 10,
     12, 13, 28, 31, 33, 34.
Coefficient of correlation, other methods, normal coefficient, etc.: 24, 29.
Coefficients of association: 34.
Coefficient of contingency: 2, 25.

As regards the conditions under which it becomes valid to assume that the
distribution of errors is normal, cf. ref. 14.

(1) Blakeman, J., "On Tests for Linearity of Regression in Frequency
     Distributions," Biometrika, vol. iv., 1905, p. 332.
(2) Blakeman, J., and Karl Pearson, "On the Probable Error of the
     Coefficient of Mean Square Contingency," Biometrika, vol. v., 1906,
     p. 191.
(3) Bowley, A. L., The Measurement of Groups and Series; C. & E. Layton,
     London, 1903.
(4) Bowley, A. L., Address to Section F of the British Association, 1906.
(5) Edgeworth, F. Y., "Observations and Statistics: An Essay on the
     Theory of Errors of Observation and the First Principles of Statistics,"
     Cambridge Phil. Trans., vol. xiv., 1885, p. 139.
(6) Edgeworth, F. Y., "Problems in Probabilities," Phil. Mag., 5th Series,
     vol. xxii., 1886, p. 371.
(7) Edgeworth, F. Y., "The Choice of Means," Phil. Mag., 5th Series,
     vol. xxiv., 1887, p. 268.
(8) Edgeworth, F. Y., "On the Probable Errors of Frequency Constants,"
     Jour. Roy. Stat. Soc., vol. lxxi., 1908, pp. 381, 499, 651; and
     Addendum, vol. lxxii., 1909, p. 81.
(9) Elderton, W. Palin, "Tables for Testing the Goodness of Fit of Theory
     to Observation," Biometrika, vol. i., 1902, p. 155.
(10) Fisher, R. A., "The Frequency Distribution of the Values of the
     Correlation Coefficient in Samples from an Indefinitely large
     Population," Biometrika, vol. x., 1915, p. 507.
(11) Gibson, Winifred, "Tables for Facilitating the Computation of
     Probable Errors," Biometrika, vol. iv., 1906, p. 385.
(12) Heron, D., "An Abac to determine the Probable Errors of Correlation
     Coefficients," Biometrika, vol. vii., 1910, p. 411. (A diagram giving
     the probable error for any number of observations up to 1000.)
(13) Heron, D., "On the Probable Error of a Partial Correlation Coefficient,"
     Biometrika, vol. vii., 1910, p. 411. (A proof, on ordinary algebraic
     lines, for the case of three variables, of the result given in (33).)
(14) Isserlis, L., "On the Conditions under which the 'Probable Errors' of
     Frequency Distributions have a real Significance," Proc. Roy. Soc.,
     Series A, vol. xcii., 1915, p. 23.
(15) Laplace, Pierre Simon, Marquis de, Théorie des probabilités, 2e édn.,
     1814. (With four supplements.)
(16) Pearl, Raymond, "The Calculation of Probable Errors of Certain
     Constants of the Normal Curve," Biometrika, vol. v., 1906, p. 190.
(17) Pearl, Raymond, "On certain Points concerning the Probable Error
     of the Standard-deviation," Biometrika, vol. vi., 1908, p. 112. (On
     the amount of divergence, in certain cases, from the standard error
     $\sigma/\sqrt{2n}$ in the case of a normal distribution.)
(18) Pearson, Karl, and L. N. G. Filon, "On the Probable Errors of
     Frequency Constants, and on the Influence of Random Selection on
     Variation and Correlation," Phil. Trans. Roy. Soc., Series A, vol. cxci.,
     1898, p. 229.
(19) Pearson, Karl, "On the Criterion that a given System of Deviations
     from the Probable in the Case of a Correlated System of Variables is
     such that it can be reasonably supposed to have arisen from Random
     Sampling," Phil. Mag., 5th Series, vol. l., 1900, p. 157.
(20) Pearson, Karl, and others (editorial), "On the Probable Errors of
     Frequency Constants," Biometrika, vol. ii., 1903, p. 273, and vol. ix.,
     1913, p. 1. (Useful for the general formulae given, based on the
     general case without respect to the form of the frequency-distribution.)
(21) Pearson, Karl, "On the Curves which are most suitable for describing
     the Frequency of Random Samples of a Population," Biometrika, vol.
     v., 1906, p. 172.
(22) Pearson, Karl, "Note on the Significant or Non-significant Character
     of a Sub-sample drawn from a Sample," Biometrika, vol. v., 1906,
     p. 181.
(23) Pearson, Karl, "On the Probability that two Independent Distributions
     of Frequency are really Samples from the same Population,"
     Biometrika, vol. viii., 1911, p. 250, and vol. x., 1914, p. 85.
(24) Pearson, Karl, "On the Probable Error of a Coefficient of Correlation
     as found from a Fourfold Table," Biometrika, vol. ix., 1913, p. 22.
(25) Pearson, Karl, "On the Probable Error of a Coefficient of Mean
     Square Contingency," Biometrika, vol. x., 1915, p. 590.
(26) Rhind, A., "Tables for Facilitating the Computation of Probable Errors
     of the Chief Constants of Skew Frequency-distributions," Biometrika,
     vol. vii., 1909-10, p. 127 and p. 386.
(27) Sheppard, W. F., "On the Application of the Theory of Error to Cases
     of Normal Distribution and Normal Correlation," Phil. Trans. Roy.
     Soc., Series A, vol. cxcii., 1898, p. 101.
(28) Soper, H. E., "On the Probable Error of the Correlation Coefficient
     to a Second Approximation," Biometrika, vol. ix., 1913, p. 91.
(29) Soper, H. E., "On the Probable Error of the Bi-serial Expression for
     the Correlation Coefficient," Biometrika, vol. x., 1914, p. 384.
(30) "Student," "On the Probable Error of a Mean," Biometrika, vol. vi.,
     1908, p. 1. (The standard error of the mean in terms of the standard
     error of the sample.)
(31) "Student," "On the Probable Error of a Correlation Coefficient,"
     Biometrika, vol. vi., 1908, p. 302. (The problem of the probable error
     with small samples.)
(32) "Student," "On the Distribution of Means of Samples which are not
     drawn at Random," Biometrika, vol. vii., 1909, p. 210.
(33) Yule, G. U., "On the Theory of Correlation for any number of
     Variables treated by a New System of Notation," Proc. Roy. Soc., Series
     A, vol. lxxix., 1907, p. 182. (See pp. 192-3 at end.)
(34) Yule, G. U., "On the Methods of Measuring Association between two
     Attributes," Jour. Roy. Stat. Soc., vol. lxxvi., 1912. (Probable error
     of the correlation coefficient for a fourfold table, of association
     coefficients, etc.)

Reference may also be made to the following, which deal for the
most part with the effects of errors other than errors of sampling:

(35) Bowley, A. L., "Relations between the Accuracy of an Average and
     that of its Constituent Parts," Jour. Roy. Stat. Soc., vol. lx., 1897,
     p. 855.
(36) Bowley, A. L., "The Measurement of the Accuracy of an Average,"
     Jour. Roy. Stat. Soc., vol. lxxv., 1911, p. 77.

EXERCISES. 

1. For the data in the last column of Table IX., Chap. VI. p. 95, find
the standard error of the median (154.7 lbs.).

2. For the same distribution, find the standard errors of the two quartiles
(142.5 lbs., 168.4 lbs.).

3. For the same distribution, find the standard error of the
semi-interquartile range.

4. The standard-deviation of the same distribution is 21.3 lbs. Find the
standard error of the mean, and compare its magnitude with that of the
standard error of the median (Qn. 1).

5. Work out the standard error of the standard-deviation for the
distribution of statures used as an illustration in § 6. (Standard-deviation
2.57 in.; 8585 observations.) Compare the ratio of standard error of
standard-deviation to the standard-deviation, with the ratio of the standard
error of the semi-interquartile range to the semi-interquartile range,
assuming the distribution normal.

6. Calculate a small table giving the standard errors of the correlation
coefficient, based on (1) 100, (2) 1000 observations, for values of r = 0, 0.2,
0.4, 0.6, 0.8, assuming the distribution normal.



APPENDIX I. 


TABLES FOR FACILITATING STATISTICAL WORK. 

A. CALCULATING TABLES.

For heavy arithmetical work an arithmometer is, of course,
invaluable; but, owing to their cost, arithmetic machines are, as a
rule, beyond the reach of the student. For a great deal of simple
work, especially work not intended for publication, the student
will find a slide-rule exceedingly useful: particulars and prices
will be found in any instrument maker's catalogue. A plain
25-cm. rule will serve for most ordinary purposes, or if greater
accuracy is desired, a 50-cm. rule, a Fuller spiral rule, or one of
the Hannyngton-pattern rules (Aston & Mander, London), in which
the scale is broken up into a number of parallel segments, may be
preferred. For greater exactness in multiplying or dividing,
logarithms are almost essential: five-figure tables suffice if answers
are only desired true to five digits; if greater accuracy is needed,
seven-figure tables must be used. It is hardly necessary to cite
special editions of tables of logarithms here, but attention may
perhaps be directed to the recently issued eight-figure tables of
Bauschinger and Peters (W. Engelmann, Leipzig, and Asher & Co.,
London, 1910; vol. i. containing logarithms of all numbers from
1 to 200,000, price 18s. 6d. net; vol. ii. containing logs. of
trigonometric functions).

If it is desired to avoid logarithms, extended multiplication
tables are very useful. There are many of these, and four of
different forms are cited below. Zimmermann's tables are
inexpensive and recommended for the elementary student, Cotsworth's,
Crelle's, or Peters' tables for more advanced work. Barlow's tables
are invaluable for calculating standard-deviations of ungrouped
observations and similar work.

(1) Barlow’s Tables of Squares, Cubes, Square-roots, Cube-roots^ and Jlecip- 
rocals of all Integer Numbers up to 10,000 , E & F. N Spon, 
London and New York , new edition, 1930, price 7s 6d 
367 



358 THEORY OF STATISTICS. 

(2) Cotsworth, M. B., The Direct Calculator, Series O. (Product table to
1000 × 1000.) M’Corquodale & Co., London; price with thumb index,
25s.; without index, 21s.

(3) Crelle, A. L., Rechentafeln. (Multiplication table giving all products up
to 1000 × 1000.) Can be obtained with explanatory introduction in
German or in English. G. Reimer, Berlin; price 16s.

(4) Elderton, W. P., “Tables of Powers of Natural Numbers, and of the
Sums of Powers of the Natural Numbers from 1 to 100” (gives
powers up to seventh), Biometrika, vol. ii. p. 474.

(5) Peters, J., Neue Rechentafeln für Multiplikation und Division. (Gives
products up to 100 × 10,000: more convenient than Crelle for forming
four-figure products. Introduction in English, French or German.)
G. Reimer, Berlin; price 15s.

(6) Zimmermann, H., Rechentafel, nebst Sammlung häufig gebrauchter
Zahlenwerthe. (Products of all numbers up to 100 × 1000: subsidiary
tables of squares, cubes, square-roots, cube-roots and reciprocals, etc.,
for all numbers up to 1000 at the foot of the page.) W. Ernst & Son,
Berlin; price 5s.; English edition, Asher & Co., London, 6s.

B. SPECIAL TABLES OF FUNCTIONS, ETC.

Several tables of service will be found in the works cited in
Appendix II., e.g. a table of Gamma Functions in Elderton’s
book (12) and a table of six-figure logarithms of the factorials
of all numbers from 1 to 1100 in De Morgan’s treatise (11). The
majority of the tables in the list below, which were originally
published in Biometrika, together with others, are contained in
Tables for Statisticians and Biometricians, Part I., edited by Karl
Pearson (Biometric Laboratory, University College, London),
price 15s. net.

(7) Davenport, C. B., Statistical Methods, with especial reference to
Biological Variation; New York, John Wiley; London, Chapman &
Hall; second edition, 1904. (Tables of area and ordinates of the
normal curve, gamma functions, probable errors of the coefficient of
correlation, powers, logarithms, etc.)

(8) Duffell, J. H., “Tables of the Gamma-function,” Biometrika, vol. vii.,
1909, p. 43. (Seven-figure logarithms of the function, proceeding by
differences of 0.001 of the argument.)

(9) Elderton, W. P., “Tables for Testing the Goodness of Fit of Theory to
Observation,” Biometrika, vol. i., 1902, p. 166.

(10) Everitt, P. F., “Tables of the Tetrachoric Functions for Fourfold
Correlation Tables,” Biometrika, vol. vii., 1910, p. 437, and vol.
viii., 1912, p. 385. (Tables for facilitating the calculation of the
correlation coefficient of a fourfold table by Pearson’s method on the
assumption that it is a grouping of a normally distributed table; cf.
ref. 14 of Chap. XVI.)

(11) Gibson, Winifred, “Tables for Facilitating the Computation of
Probable Errors,” Biometrika, vol. iv., 1906, p. 385.

(12) Heron, D., “An Abac to determine the Probable Errors of Correlation
Coefficients,” Biometrika, vol. vii., 1910, p. 411. (A diagram giving
the probable error for any number of observations up to 1000.)

(13) Lee, Alice, “Tables of F(r, ν) and H(r, ν) Functions,” British
Association Report, 1899. (Functions occurring in connection with
Professor Pearson’s frequency curves.)





(14) Lee, Alice, “Tables of the Gaussian ‘Tail-functions,’ when the ‘tail’
is larger than the body,” Biometrika, vol. x., 1914, p. 208.

(15) Rhind, A., “Tables for Facilitating the Computation of Probable Errors
of the Chief Constants of Skew Frequency-distributions,” Biometrika,
vol. vii., 1909-10, p. 127 and p. 386.

(16) Sheppard, W. F., “New Tables of the Probability Integral,” Biometrika,
vol. ii., 1903, p. 174. (Includes not merely a table of areas of the normal
curve (to seven figures), but also a table of the ordinates to the same
degree of accuracy.)

(17) Sheppard, W. F., “Table of Deviates of the Normal Curve” (with
introductory article on Grades and Deviates by Sir Francis Galton),
Biometrika, vol. v., 1907, p. 404. (A table giving the deviation of
the normal curve, in terms of the standard-deviation as unit, for the
ordinates which divide the area into a thousand equal parts.)

A number of useful tables will be found in the series “Tracts
for Computers,” published by the Cambridge University Press for
the Department of Applied Statistics, University College, London.
A list is usually given in the advertisement pages of the current
issue of Biometrika.



APPENDIX II. 


SHORT LIST OF WORKS ON THE MATHEMATICAL 
THEORY OF STATISTICS AND THE THEORY OF 
PROBABILITY. 

The student may find the following short list of service, as
supplementing the lists of references given at the ends of the
several chapters, the latter containing, as a rule, original memoirs
only. The economic student who wishes to know more of the
practical side of statistics may be referred to Mr A. L. Bowley’s
“Elements” (6 below), to An Elementary Manual of Statistics
(3rd ed., Macdonald & Evans, London, 1925), by the same writer
(useful as a general guide to English statistics), and to M. Jacques
Bertillon’s Cours élémentaire de statistique (Société d’éditions
scientifiques, 1895: international in scope). Dr A. Newsholme’s
Vital Statistics (Swan Sonnenschein, 3rd edn., 1899) will also be
of service to students of that subject.

The great majority of the works mentioned in the following
list, with others which it has not been thought necessary to
include, are in the library of the Royal Statistical Society.

(1) Airy, Sir G. B., On the Algebraical and Numerical Theory of Errors of
Observations; 1st edn., 1861; 3rd edn., 1879.

(2) Bernoulli, J., Ars conjectandi, opus posthumum: Accedit tractatus de
seriebus infinitis, et epistola gallice scripta de ludo pilæ reticularis,
1713. (A German translation in Ostwald’s Klassiker der exakten
Wissenschaften, Nos. 107, 108.)

(3) Bertrand, J. L. F., Calcul des probabilités; Gauthier-Villars, Paris, 1889.

(4) Betz, W., Ueber Korrelation; Beihefte zur Zeitschrift für ang. Psych.
und psych. Sammelforschung; J. A. Barth, Leipzig, 1911.
(Applications to psychology.)

(5) Borel, E., Éléments de la théorie des probabilités; Hermann, Paris, 1909.

(6) Bowley, A. L., Elements of Statistics; P. S. King, London; 1st edn.,
1901; 3rd edn., 1907.

(7) Brown, W., The Essentials of Mental Measurement; Cambridge
University Press, 1911. (Part 2 on the theory of correlation: applications
to experimental psychology.)

(8) Bruns, H., Wahrscheinlichkeitsrechnung und Kollektivmasslehre;
Teubner, Leipzig, 1906.




(9) Cournot, A. A., Exposition de la théorie des chances et des probabilités,
1843.

(10) Czuber, E., Wahrscheinlichkeitsrechnung und ihre Anwendung auf
Fehlerausgleichung, Statistik und Lebensversicherung; Teubner,
Leipzig, 2nd edn., vol. i., 1908-10.

(11) De Morgan, A., Treatise on the Theory of Probabilities (extracted from
the Encyclopædia Metropolitana), 1837.

(12) Elderton, W. P., Frequency Curves and Correlation; C. & E. Layton,
London, 1906. (Deals with Professor Pearson’s frequency curves and
correlation, with illustrations chiefly of actuarial interest.)

(13) Fechner, G. T., Kollektivmasslehre (posthumously published; edited
by G. F. Lipps); Engelmann, Leipzig, 1897.

(14) Galloway, T., Treatise on Probability (republished from the 7th edn.
of the Encyclopædia Britannica), 1839.

(15) Gauss, C. F., Méthode des moindres carrés: Mémoires sur la combinaison
des observations, traduits par J. Bertrand, 1856.

(16) Johannsen, W., Elemente der exakten Erblichkeitslehre; Fischer, Jena,
2te Ausgabe, 1913. (Very largely concerned with an exposition of the
statistical methods.)

(17) Laplace, Pierre Simon, Marquis de, Essai philosophique sur les
probabilités, 1814. (The introduction to 18, separately printed with
some modifications.)

(18) Laplace, Pierre Simon, Marquis de, Théorie analytique des probabilités;
2nd edn., 1814, with supplements 1 to 4.

(19) Lexis, W., Abhandlungen zur Theorie der Bevölkerungs- und
Moralstatistik; Fischer, Jena, 1903.

(20) Poincaré, H., Calcul des probabilités; Gauthier-Villars, Paris, 1896.

(21) Poisson, S. D., Recherches sur la probabilité des jugements en matière
criminelle et en matière civile, précédées des règles générales du calcul
des probabilités. (German translation by C. H. Schnuse, 1841.)

(22) Quetelet, L. A. J., Lettres sur la théorie des probabilités, appliquée aux
sciences morales et politiques, 1846. (English translation by O. G.
Downes, 1849.)

(23) Thorndike, E. L., An Introduction to the Theory of Mental and Social
Measurements; Science Press, New York, 1904.

(24) Venn, J., The Logic of Chance: an Essay on the Foundations and
Province of the Theory of Probability, with especial reference to its
Logical Bearings and its Application to Moral and Social Science and to
Statistics; 3rd edn., Macmillan, London, 1888.

(25) Westergaard, H., Die Grundzüge der Theorie der Statistik; Fischer,
Jena, 1890.



SUPPLEMENTS. 


I. NOTES SUPPLEMENTARY TO CHAPTER VI. 

6. Position of Intervals. It is said in the text that in some
exceptional cases the observations exhibit a marked clustering
round certain values. The word exceptional should hardly have
been used. Whenever there is some doubt as to the final digit
in reading a scale, scope is given to the idiosyncrasies of the
observer and the distribution of frequency over the final digits
is rarely uniform. The most conspicuous feature is usually the
tendency to round off to the nearest unit, thus making 0 the
most frequent final digit, but 5’s may also be emphasised if
emphasised on the scale itself, and the excesses of 0’s and 5’s
may be drawn in the most diverse ways from the other parts of
the scale.


Table A. Frequency-distributions of Final Digits in Measurements by
Four Observers.

                      Frequency of Final Digit per 1000
Final Digit              A        B        C        D
     0                  158      122      251      358
     1                   97       98       37       49
     2                  125       98       80       90
     3                   73       90       72       63
     4                   76      100       55       37
     5                   71      112      222      211
     6                   90       98       71       62
     7                   56       99       75       70
     8                  126      101       72       44
     9                  129       81       65       16

   Total               1001      999     1000     1000

Actual observations    1258     3000     1000     1000



Table A shows results for four observers as illustrations, the
frequencies being reduced for comparability to a total of 1000.
Column A is based on measures by myself, on drawings, to the
nearest tenth of a millimetre. It is recognised, of course, that
measures cannot really be made to such a degree of precision;
but I believed that I was making them carefully, and as they
were made with a Zeiss scale, in which the divisions are ruled
on the under side of a piece of plate-glass, readings are unaffected
by parallax. Nevertheless it will be seen that I heavily
over-emphasised the zeros, and also 2, 8, and 9, an odd selection of
preferences. On the whole, the centre of the millimetre was
neglected and measures piled up at the two ends.

The data for columns B, C, and D were all drawn from the
same published report, and refer to sundry head measurements
taken on the living subject. Guided by a statement in the
introduction, it was possible to compile the data separately for the
three assistants (B, C, D) who had done the actual measuring.
It will be seen that B was rather good: there is a relatively slight
excess at 0 and 5, but otherwise his measurements are fairly
uniformly distributed. C was decidedly not good, rounding off
nearly one measurement in two to the nearest centimetre or
half-centimetre. D was simply outrageously bad, so bad that
it might have been better not to publish his measurements.
Nearly 57 per cent. of his measurements are made only to the
nearest centimetre or half-centimetre, a quite inadequate degree
of precision for head measurements often only a few centimetres
in magnitude.

Compilation of data in the form of Table A is recommended
as giving some control of their value, and as a check on assistants.
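
The check described above is easily mechanised. The following is only a minimal sketch in Python: the figures are those of Table A, while the function name `share_of_digits` is an illustrative assumption and not part of the text. With ten equally likely final digits the share of 0's and 5's together should be near one-fifth.

```python
# Frequencies of final digits 0..9 per 1000, copied from Table A.
table_a = {
    "A": [158, 97, 125, 73, 76, 71, 90, 56, 126, 129],
    "B": [122, 98, 98, 90, 100, 112, 98, 99, 101, 81],
    "C": [251, 37, 80, 72, 55, 222, 71, 75, 72, 65],
    "D": [358, 49, 90, 63, 37, 211, 62, 70, 44, 16],
}

def share_of_digits(frequencies, digits=(0, 5)):
    """Proportion of readings whose final digit is one of `digits`."""
    return sum(frequencies[d] for d in digits) / sum(frequencies)

for observer, frequencies in table_a.items():
    share = share_of_digits(frequencies)
    # A share well above 0.2 points to rounding to the unit or half-unit.
    print(observer, round(share, 3))
```

For observers C and D the shares come out at about 0.47 and 0.57, which is the "nearly one measurement in two" and "nearly 57 per cent." of the text.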

15. The Extremely Asymmetrical or J-shaped Distribution.
Dr J. C. Willis has shown that any number of illustrations of
this form of distribution may be obtained by compiling the
frequency distribution for numbers of genera with 1, 2, 3 . . .
species in any biological group. Table B shows the distribution
for the Chrysomelid beetles.




Table B. Chrysomelidæ (beetles): Numbers of Genera with 1, 2, 3 . . .
Species. (Compiled by Dr J. C. Willis, F.R.S.; cited from G. U. Yule,
“A Mathematical Theory of Evolution based on the Conclusions of Dr
J. C. Willis,” Phil. Trans., B, vol. ccxiii., 1924, p. 85.)

Species  Genera     Species  Genera     Species  Genera
    1      215         32       1          74       1
    2       90         33       1          76       1
    3       38         34       1          77       1
    4       35         35       1          79       1
    5       21         36       3          83       1
    6       16         37       1          84       3
    7       15         38       1          87       2
    8       14         39       2          89       1
    9        5         40       2          92       2
   10       15         41       1          93       1
   11        8         43       4         110       1
   12        9         44       1         114       1
   13        5         45       1         115       1
   14        6         46       1         128       1
   15        8         49       2         132       1
   16        6         50       4         133       1
   17        6         52       1         146       1
   18        3         53       1         163       1
   19        4         56       1         196       1
   20        3         58       1         217       1
   21        4         59       1         227       1
   22        4         62       1         264       1
   23        5         63       3         327       1
   24        4         65       1         399       1
   25        2         66       1         417       1
   26        3         67       1         681       1
   27        1         69       1
   28        3         71       1
   29        3         72       1        Total     627
   30        3         73       1








II. DIRECT DEDUCTION OF THE FORMULAE 
FOR REGRESSIONS. 

(Supplementary to Chapters IX. and XII.)

To those who are acquainted with the differential calculus the
following direct proof may be useful. It is on the lines of the
proof given in Chapter XII. § 3.

Taking first the case of two variables (Chapter IX.), it is
required to determine values of $a_1$ and $b_1$ in the equation

$$x = a_1 + b_1 y$$

(where $x$ and $y$ denote deviations from the respective means)
that will make the sum of the squares of the errors like

$$u = x' - a_1 - b_1 y'$$

a minimum, $x'$ and $y'$ being a pair of associated deviations.

The required equations for determining $a_1$ and $b_1$ will be given
by differentiating

$$\Sigma(u^2) = \Sigma(x - a_1 - b_1 y)^2$$

with respect to $a_1$ and to $b_1$ and equating to zero.

Differentiating with respect to $a_1$ we have

$$\Sigma(x - a_1 - b_1 y) = 0.$$

But $\Sigma(x) = \Sigma(y) = 0$, and consequently we have $a_1 = 0$.

Dropping $a_1$, and differentiating with respect to $b_1$,

$$\Sigma(x - b_1 y)\,y = 0,$$

that is,

$$b_1 = \frac{\Sigma(xy)}{\Sigma(y^2)} = r\,\frac{\sigma_x}{\sigma_y},$$

as on p. 171.

Similarly, if we determine the values of $a_2$ and $b_2$ in the
equation

$$y = a_2 + b_2 x$$

that will make the sum of the squares of the errors like

$$v = y' - a_2 - b_2 x'$$

a minimum, we will find

$$a_2 = 0, \qquad b_2 = \frac{\Sigma(xy)}{\Sigma(x^2)} = r\,\frac{\sigma_y}{\sigma_x}.$$




If, as in Chapter XII. §§ 4 et seq. (cf. especially § 7), a number
of variables are involved, the equations for determining the
coefficients will be given by differentiating

$$\Sigma\left(x_1 - b_{12.34\ldots n}\,x_2 - \ldots - b_{1n.23\ldots(n-1)}\,x_n\right)^2$$

with respect to each coefficient in turn and equating the result to
zero. This gives the equations of the form there stated. If a
constant term be introduced, its “least square” value will be
found to be zero, as above.
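
The two-variable result can be checked numerically. The sketch below, in Python, is an illustration only: the paired observations are hypothetical values chosen for the purpose, and the helper names `deviations` and `sum_sq_errors` are assumptions, not anything given in the text. It forms deviations from the means, computes the least-square coefficient as the sum of products divided by the sum of squares of y, and compares it with r multiplied by the ratio of the standard deviations.

```python
import math

def deviations(values):
    """Deviations of the values from their arithmetic mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def sum_sq_errors(x, y, a1, b1):
    """Sum of squares of the errors u = x - a1 - b1*y."""
    return sum((xi - a1 - b1 * yi) ** 2 for xi, yi in zip(x, y))

# Hypothetical paired observations, chosen only for illustration.
x = deviations([61.0, 64.0, 66.0, 69.0, 70.0, 72.0])
y = deviations([118.0, 130.0, 135.0, 150.0, 149.0, 164.0])

n = len(x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_yy = sum(yi * yi for yi in y)
sum_xx = sum(xi * xi for xi in x)

b1 = sum_xy / sum_yy                      # least-square coefficient
sigma_x = math.sqrt(sum_xx / n)
sigma_y = math.sqrt(sum_yy / n)
r = sum_xy / (n * sigma_x * sigma_y)      # correlation coefficient

best = sum_sq_errors(x, y, 0.0, b1)       # a1 = 0, as in the derivation
print(b1, r * sigma_x / sigma_y, best)
```

Any other choice of the constant term or of the coefficient, for example `sum_sq_errors(x, y, 0.1, b1)`, gives a larger sum of squares than `best`.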

III. THE LAW OF SMALL CHANCES. 

(Supplementary to Chapter XV.)

We have seen that the normal curve is the limit of the binomial
$(p+q)^n$ when $n$ is large and neither $p$ nor $q$ very small. The
student's attention will now be directed to the limit reached
when either $p$ or $q$ becomes very small, but $n$ is so large that
either $np$ or $nq$ remains finite.

Let us regard the $n$ trials of the event, for which the chance of
success at each trial is $p$, as made up of $m + m' = n$ trials; then
the probability of having at least $m$ successes in the $m + m'$
trials is evidently the sum of the $m' + 1$ terms of the expansion
of $(p+q)^n$ beginning with $p^n$. But this probability, which we
may term $P_m$, can be expressed in another and more convenient
form with the help of the following reasoning. The required
result might happen in any one of $m' + 1$ ways. For instance:

(a) Each of the first $m$ trials might succeed; the chance of
this is $p^m$.

(b) The first $m + 1$ trials might give $m$ successes and 1 failure,
the latter not to happen on the $(m+1)$th trial (a condition already
covered by (a)). But the probability of $m$ successes and 1 failure,
the latter at a specified trial, is $p^m q$, and, as the failure might
occur in any one of $m$ out of $m + 1$ trials, the complete probability
of (b) is $m\,p^m q$.

(c) The first $m + 2$ trials might give $m$ successes and 2 failures,
the $(m+2)$th trial not to be a failure (so as to avoid a repetition
of either of the preceding cases); the probability of this is

$$\frac{m(m+1)}{2!}\,p^m q^2.$$

In a similar way we find for the contribution of $m + 3$ trials,
giving $m$ successes and 3 failures,

$$\frac{m(m+1)(m+2)}{3!}\,p^m q^3.$$

Ultimately we reach

$$P_m = p^m\left\{1 + \frac{m}{1!}\,q + \frac{m(m+1)}{2!}\,q^2 + \ldots + \frac{m(m+1)\ldots(m+m'-1)}{m'!}\,q^{m'}\right\}. \qquad (7)$$

This expression is of course equivalent to the first $m' + 1$ terms of
the binomial expansion beginning with $p^n$, as the student can
verify. For instance, if $m = n - 2$, so that $m' = 2$, we have

$$p^{n-2}\left\{1 + (n-2)q + \frac{(n-2)(n-1)}{2!}\,q^2\right\}
= p^{n-2}(1-q)^2 + n\,p^{n-2}(1-q)\,q + \frac{n(n-1)}{2!}\,p^{n-2}q^2$$

$$= p^n + n\,p^{n-1}q + \frac{n(n-1)}{2!}\,p^{n-2}q^2.$$
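
The equivalence just asserted between the series in powers of q and the leading terms of the binomial can be verified for any particular values. The Python sketch below is offered only as such a check: the function names are illustrative assumptions, and the coefficient m(m+1)...(m+k-1)/k! is written as the binomial coefficient `comb(m + k - 1, k)`, to which it is equal for m not less than 1.

```python
from math import comb

def at_least_m_by_binomial(n, m, p):
    """Sum of the first m'+1 terms of (p+q)^n beginning with p^n,
    i.e. the chance of not more than m' = n - m failures."""
    q = 1.0 - p
    return sum(comb(n, k) * p ** (n - k) * q ** k for k in range(n - m + 1))

def at_least_m_by_series(n, m, p):
    """The same chance from p^m {1 + m q + m(m+1) q^2 / 2! + ...},
    stopping at the term in q^(m'). Requires m >= 1."""
    q = 1.0 - p
    m_prime = n - m
    return p ** m * sum(comb(m + k - 1, k) * q ** k for k in range(m_prime + 1))

# Illustrative values only: 10 trials, at least 8 successes, p = 0.9.
print(at_least_m_by_binomial(10, 8, 0.9), at_least_m_by_series(10, 8, 0.9))
```

The two functions agree to within rounding error for any n, m and p tried, which is the identity the student is invited to verify.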


Let us now suppose that $q$ is very small, so that $m'/n$ = ratio of
failures to total trials is also very small. Let us also suppose
that $n$ is so large that $nq = \lambda$ is finite. Writing $q = \lambda/n$ and putting
$m = n - m'$, (7) becomes

$$\left(1 - \frac{\lambda}{n}\right)^{n-m'}\left\{1 + \lambda + \frac{\lambda^2}{2!} + \ldots + \frac{\lambda^{m'}}{m'!}\right\},$$

since $m'/n$ and smaller fractions can be neglected.

But $\left(1 - \frac{\lambda}{n}\right)^{n}$ is shown in books on algebra to be equal to
$e^{-\lambda}$, where $e$ is the base of the natural logarithms, when $n$ is infinite
and, under similar conditions,

$$\left(1 - \frac{\lambda}{n}\right)^{-m'} = 1.$$

Hence, if $n$ be large and $q$ small, we have

$$e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \ldots + \frac{\lambda^{m'}}{m'!}\right). \qquad (8)$$

If we put $m' = 0$ we have the chance that the event succeeds
every time, and (8) reduces to $e^{-\lambda}$. Put $m' = 1$, and we get the
chance that the event shall not fail more than once, $e^{-\lambda}(1 + \lambda)$, so
that $e^{-\lambda}\lambda$ is the chance of exactly one failure, and the terms
within the bracket give us the proportional frequencies of 0, 1, 2,
etc. failures. In other words, (8) is the limit of the binomial
$(p+q)^n$ when $q$ is very small but $nq$ finite.
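
The approach of the binomial to this limit may be illustrated by direct computation. The following Python sketch is an illustration under stated assumptions: the value 0.61 for the product nq and the function names are chosen here for the purpose and are not prescribed by the text. It holds nq fixed, increases n, and records the largest difference between the binomial chance of k failures and the corresponding term of (8).

```python
from math import comb, exp, factorial

def binomial_failures(n, q, k):
    """Chance of exactly k failures in n trials, chance of failure q."""
    return comb(n, k) * q ** k * (1.0 - q) ** (n - k)

def poisson_term(lam, k):
    """Term of the limit (8) for exactly k failures: e^(-lam) lam^k / k!."""
    return exp(-lam) * lam ** k / factorial(k)

def largest_gap(n, lam, max_failures=6):
    """Largest absolute difference over k = 0 .. max_failures."""
    q = lam / n
    return max(abs(binomial_failures(n, q, k) - poisson_term(lam, k))
               for k in range(max_failures + 1))

lam = 0.61  # assumed value of nq, held fixed while n grows
gaps = {n: largest_gap(n, lam) for n in (10, 100, 1000, 10000)}
for n, gap in gaps.items():
    print(n, gap)
```

The differences fall off roughly in proportion to 1/n, so that for n of some thousands the limit is, for practical purposes, indistinguishable from the binomial.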

The investigation contained in the preceding paragraphs was
published in 1837 by Poisson, so that (8) may be termed Poisson's
limit to the binomial; but the result has been reached
independently by several writers since Poisson’s time, and we shall give
one of the methods of proof adopted by modern statisticians, which
the student may perhaps find easier to follow than that of Poisson
(see ref. 19, p. 273).

$$(p+q)^n = (1 - q + q)^n = (1-q)^n\left(1 + \frac{q}{1-q}\right)^n \qquad (9)$$

The first bracket on the right is equal to $e^{-\lambda}$ when $q$ is
indefinitely small. Expanding the second bracket, we have

$$1 + n\,\frac{q}{1-q} + \frac{n(n-1)}{1\cdot 2}\left(\frac{q}{1-q}\right)^2 + \ldots$$

The ratio of the $(r+1)$th to the $r$th term is

$$\frac{q}{1-q}\cdot\frac{n-r+1}{r}, \qquad (9a)$$

which reduces to $\lambda/r$ when $q$ is very small. The convergence of
the series is seen from the fact that $r$ cannot exceed $n$, and the
substitution of this value in (9a) reduces it to

$$\frac{q^2}{(1-q)\lambda},$$

which vanishes with $q$.

Hence the second bracket on the right of (9) may be written

$$\left\{1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \ldots\right\}$$

and (9) is identical with (8).



The frequent rediscovery of this theorem is due to the fact that
its value is felt in the study of problems involving small,
independent probabilities. For instance, if we desired to find the
distribution of $n$ things in $N$ pigeon-holes (all the pigeon-holes
being of equal size and equally accessible), $N$ being large, the
distribution given by the binomial

$$\left(\frac{1}{N} + \frac{N-1}{N}\right)^n$$

would be effectively represented by (8), tables of which for
different values of $\lambda$ have been published by v. Bortkewitsch and
others.

The theorem has also been applied to cases in which, although
the actual value of $q$ (or $p$) is unknown, it may safely be assumed
to be very small. It should be noticed that, if (8) is the real law
of distribution, certain relations must obtain between the
constants of the statistics (see par. 12, Chapter XIII.). Using the
method of par. 6, Chapter XV., we have for the mean

$$e^{-\lambda}\left(\lambda + \frac{2\lambda^2}{2!} + \frac{3\lambda^3}{3!} + \ldots\right)
= \lambda e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!} + \ldots\right) = \lambda,$$

and for $\sigma^2$

$$e^{-\lambda}\left(\lambda + 2\lambda^2 + \frac{3\lambda^3}{2!} + \ldots\right) - \lambda^2
= \lambda + e^{-\lambda}\lambda^2\left(1 + \lambda + \frac{\lambda^2}{2!} + \ldots\right) - \lambda^2
= \lambda.$$

Hence any statistics produced by causes conforming to Poisson’s
limit should, within the limits of sampling, have the mean equal
to the square of the standard deviation. For instance, in the
statistics used in par. 12 of Chapter XIII., the mean is .61,
$\sigma$ = .78, $\sigma^2$ = .6079.




If we now compute the theoretical frequencies from (8), putting
$\lambda$ = .61, we have the following results:

Deaths.     Actual        Frequency assigned
            Frequency.    by Poisson’s Limit.
   0          109            108.7
   1           65             66.3
   2           22             20.2
   3            3              4.1
   4            1               .7 (4 and over)

The agreement here is excellent, but such a concordance is not
very common in actual statistics. Cases do, however, occur in
which the method is of service, and the advanced student will find
that the reasoning illustrated is of value in many theoretical
investigations.
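
The figures of the table, and the near equality of the mean and the square of the standard deviation, can be reproduced in a few lines. The Python sketch below uses only the actual frequencies tabulated above; the variable names are illustrative assumptions, and the last class is treated as exactly 4 deaths in forming the moments.

```python
from math import exp, factorial

actual = [109, 65, 22, 3, 1]          # frequencies of 0, 1, 2, 3, 4 deaths
total = sum(actual)                   # 200 observations in all

mean = sum(k * f for k, f in enumerate(actual)) / total
variance = sum(k * k * f for k, f in enumerate(actual)) / total - mean ** 2

# Theoretical frequencies from Poisson's limit with lambda = mean;
# the last class collects "4 and over" as the remainder.
theoretical = [total * exp(-mean) * mean ** k / factorial(k) for k in range(4)]
theoretical.append(total - sum(theoretical))

print(round(mean, 4), round(variance, 4))
print([round(t, 1) for t in theoretical])
```

The mean comes out at .61 and the variance at .6079, and the rounded theoretical frequencies agree with the last column of the table.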


IV. GOODNESS OF FIT. 

(Supplementary to Chapter XVII.)

In par. 15, Chapter XV. (p. 308), it was remarked that the general
treatment of the problem, whether the discrepancies between
any system of observed frequencies and those postulated by a
theoretical law might have arisen by the operation of simple
sampling, was beyond the scope of this work. As, however, the
student will find in the course of his reading that a test of this
character is often applied in practical problems, the following
notes may be of service by way of comment on, or elucidation
of, the highly technical papers in which the subject is fully
discussed (see refs. 22 and 23, p. 315, and also additional
refs. on p. 394).

The student who has followed the argument leading up to
the table on p. 310 will have perceived that, when the frequency
distribution of a variable is known, the probability that a set of
observations departing from the most likely value would occur
can be evaluated by comparing the portion of area bounded by
the ordinate corresponding to the observed deviation with the
whole area of the theoretical curve, and the work is illustrated
in Examples i.-iv. of pp. 311-313. In this case there is only a
single variable, and the test for goodness of fit is reduced to its
simplest terms. But a consideration of Chapter XVI., and the
relation there shown to hold between the normal curve and the
surface of normal correlation, at once suggests that the same
principle will apply when there are two variables.

It was proved on pp. 319-321 that the contours of a normal
surface are a system of concentric ellipses. Now suppose we
have a normal system of frequency in two variables $x$ and $y$;
then the chance that on simple sampling we should obtain the
combination $x'y'$ is measured by the corresponding ordinate of
the surface, and the feet of all ordinates of equal height will lie
upon an ellipse which will therefore be the locus of all
combinations of $x$ and $y$ equally likely to occur as is $x'y'$. Any
combination more likely to occur than $x'y'$ will have a taller ordinate,
and as the locus of its foot must also be an ellipse, that ellipse
will be contained within the $x'y'$ ellipse. Conversely,
combinations less likely to occur than $x'y'$ will be represented by
ordinates located upon ellipses wholly surrounding the $x'y'$
ellipse. Hence, if we dissect the surface into indefinitely thin
elliptical slices and determine the total volume of the sum of
the slices from $x = x'$ and $y = y'$ down to $x = 0$ and $y = 0$, this
volume divided by the total volume of the surface will be the
probability of obtaining in sampling a result not worse than
$x'y'$; or, if we prefer, we may sum from $x = x'$, $y = y'$ to
$x = \infty$, $y = \infty$, and then the fraction is the chance of obtaining as
bad a result as $x'y'$, or a worse result.

The reader who has compared the figures on p. 166 and
p. 246, and followed the algebra of pp. 331-332, will have no
difficulty in seeing that, when the number of variables is
3, 4 . . . $n$, the above principle remains valid although it
ceases to be possible to give a graphic representation. With
three variables the contour ellipse becomes an ellipsoidal surface,
and the four-dimensioned frequency “volume” must be dissected
into tridimensional ellipsoids; with four variables another
dimension is involved, and so on; but throughout the equation
of the contour of equal probability is of the ellipse type (cf. the
generalisation of the theorems of Chapter IX. in Chapter XII.).

Let us now suppose that if a certain set of data is derived
from a statistical universe conforming to a particular law, these
data, $N$ in number, should be distributed into $n + 1$ groups
containing respectively $n_0, n_1, n_2, \ldots, n_n$ each. Instead of this
we actually find $m_0, m_1, m_2, \ldots, m_n$, where

$$m_0 + m_1 + \ldots + m_n = n_0 + n_1 + \ldots + n_n = N.$$

The problem to be solved is whether the observed system of
deviations from the most probable values might have arisen in
random sampling. Since, $N$ being given, fixing the contents of
any $n$ of the classes determines the $(n+1)$th, there are only $n$
independent variables. Let us now suppose that the distribution
of deviations is normal. Then the equation of the frequency
“solid” is of the type set out in equation (12) of p. 331, which
we will write for the present in the form

$$y = y_0\,e^{-\frac{1}{2}\chi^2};$$

$\chi^2$ = a constant, is then the equation of the “ellipsoid” delimiting
the two portions of the “volume” corresponding to
combinations more or less likely to occur than $m_0, m_1, \ldots, m_n$.

Accordingly, to find the chance of a system of deviations as
probable as or less probable than that observed, we have to
dissect the frequency solid, adding together the elliptic elements
from the ellipsoid $\chi$ to the ellipsoid $\infty$, and to divide this
summation by the total volume, i.e. the summation from the
ellipsoid 0 to the ellipsoid $\infty$.

In this book we have been concerned with summations the
elements of which were finite. The reader is probably aware
that when the element summed is taken indefinitely small the
summation is called an integration, the symbol $\int$ replacing $\Sigma$ or $S$,
and the infinitesimal element being written $dx$. In the present
case we have to reduce an $n$-fold integral, the summation relating
to $n$ elements $dx_1$, etc. To reduce this $n$-fold integral to a
single integral, the following method is adopted. In the first
place the ellipsoid, referred to its principal axes, is transformed
into a spheroid by stretching or squeezing, and the system of
rectangular co-ordinates transformed into polar co-ordinates.

The reason for adopting the latter device is that, when two
rectangular elements $dx$, $dy$ are transformed to polar co-ordinates,
we replace them by an angular element $d\theta$, a vectorial element $dr$,
and a term in $r$, the radius vector. When $n$ such elements are
transformed, the integral vectorial factor is raised to the $(n-1)$th
power and there is an infinitesimal vectorial element, $dr$, and a
“solid” angular element. But as the limits of integration of
the angular (not of the vectorial) element will be the same in
the numerator and denominator, these cancel out, while $\chi$ may
be treated as the vectorial element or ray. Hence the multiple
integral reduces to a single integral and the expression becomes

$$\frac{\int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{n-1}\,d\chi}{\int_{0}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{n-1}\,d\chi},$$

the reduction of which, its integration, can be effected in terms
of $\chi$ by methods described in text-books of the integral calculus.
Everything turns, therefore, upon the computation of the function $\chi$.
As we have seen, $\chi^2$ is determined by evaluating the standard
deviations of the $n$ variables and their correlations two at a time
(the higher partials being deducible if the correlations of zero
order are known).
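
The ratio of single integrals reached above may also be evaluated by simple quadrature, which gives a check on tabulated values. The Python sketch below is an illustration only and not the method of the published tables: the function names, the use of Simpson's rule, and the finite upper limit of 60 standing in for infinity are all assumptions made here. The argument `n_variables` is the number n of independent variables, i.e. one less than the number of classes.

```python
import math

def simpson(f, a, b, steps):
    """Composite Simpson's rule over [a, b] with an even number of steps."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def chi_tail_probability(chi_squared, n_variables, upper=60.0, steps=60000):
    """Integral of exp(-x^2/2) x^(n-1) from chi upward, divided by the
    same integral taken from 0 upward (upper stands in for infinity)."""
    def integrand(x):
        return math.exp(-0.5 * x * x) * x ** (n_variables - 1)
    chi = math.sqrt(chi_squared)
    numerator = simpson(integrand, chi, upper, steps)
    denominator = simpson(integrand, 0.0, upper, steps)
    return numerator / denominator

# Thirteen classes (12 independent variables), chi-squared = 30.
print(round(chi_tail_probability(30.0, 12), 6))
# Eight classes (7 independent variables), chi-squared = 5 and 6.
print(round(chi_tail_probability(5.0, 7), 6), round(chi_tail_probability(6.0, 7), 6))
```

The values so found agree with the tabulated entries .002792, .659963 and .539750 quoted in the worked examples of this supplement.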

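By an application of the method of p. 257, writing $n_p$ for the
theoretical and $m_p$ for the observed content of the $p$th class, and
$N$ for the total number of observations, we have

$$\sigma_p = \sqrt{n_p\left(1 - \frac{n_p}{N}\right)}$$

for the standard error of sampling in the content of the $p$th class;
while by a similar adaptation of the reasoning on p. 342 we reach

$$r_{pq} = -\sqrt{\frac{n_p\,n_q}{(N - n_p)(N - n_q)}}$$

for the correlation of errors of sampling in the $p$th and $q$th classes.
With these data, $\chi^2$ can be deduced (the actual process of
reduction is somewhat lengthy, but the student should have no difficulty
in following the steps given in pp. 370-2 of ref. 116, infra). Its
value is

$$\chi^2 = \sum_{p=0}^{n}\frac{(m_p - n_p)^2}{n_p},$$

the summation extending to all $n + 1$ classes of the frequency
distribution (in the worked tables the observed and expected
frequencies of a class are written $m'$ and $m$, so that each term is
$(m' - m)^2/m$).

Values of the probability that an equally likely or less likely
system of deviations will occur, usually denoted by the letter
$P$, have been computed for a considerable range of $\chi^2$ and of
$n'$ (= $n + 1$), the number of classes, and are published in the Tables
for Statisticians and Biometricians mentioned on p. 358.

The arithmetical process is illustrated upon the two examples
of dice-throwing given on p. 258.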

There are three points which the student should note as regards
the practical application of the method. In the first place, the
proof given assumes that deviations from the expected frequencies
follow the normal law. This is a reasonable assumption only if
no theoretical frequency is very small, for if it is very small the
distribution of deviations will be skew and not normal. It is
desirable, therefore, to group together the small frequencies in
the “tail” of the frequency distribution, as is done in the second
illustration below, so as to make the expected frequency a few
units at least. In the case of the first illustration it might have
been better to group the frequency of 0 successes with that of





Twelve Dice thrown 4096 times, a throw of 4, 5, or 6 points reckoned
a success (p. 258).

No. of       Observed      Expected Frequency      (m' - m)^2     (m' - m)^2 / m
Successes.   Frequency     (m) = 4096(1/2 + 1/2)^12
             (m').
    0            0                 1                     1            1.0000
    1            7                12                    25            2.0833
    2           60                66                    36             .5455
    3          198               220                   484            2.2000
    4          430               495                  4225            8.5354
    5          731               792                  3721            4.6982
    6          948               924                   576             .6234
    7          847               792                  3025            3.8194
    8          536               495                  1681            3.3960
    9          257               220                  1369            6.2227
   10           71                66                    25             .3788
   11           11                12                     1             .0833
   12            0                 1                     1            1.0000

Totals        4096              4096                                 34.5860 = $\chi^2$

From the tables we find:

   n'     $\chi^2$      P
   13       30       .002792
   13       40       .000072

Hence, by interpolation for $\chi^2$ = 34.5860, P = .0015.


Twelve Dice thrown 4096 times, a throw of 6 points reckoned a success.

No. of       Observed      Expected Frequency      (m' - m)^2     (m' - m)^2 / m
Successes.   Frequency     (m) = 4096(5/6 + 1/6)^12
             (m').
    0          447               459                   144             .3137
    1         1145              1103                  1764            1.5993
    2         1181              1213                  1024             .8442
    3          796               809                   169             .2089
    4          380               364                   256             .7033
    5          115               116                     1             .0086
    6           24                27                     9             .3333
7 and over       8                 5                     9            1.8000

Totals        4096              4096                                  5.8113 = $\chi^2$

From the tables we find:

   n'     $\chi^2$      P
    8        5       .659963
    8        6       .539750

Hence, by interpolation for $\chi^2$ = 5.8113, P = .5624.




1 success, and the frequency of 12 successes with that of 11
successes.
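
The arithmetic of the two dice illustrations can be retraced as follows. The Python sketch below is a check only: the observed and expected frequencies and the four tabulated values of P are those printed in the illustrations, while the function names `chi_squared` and `interpolate` are assumptions introduced here. The interpolation is the simple linear one between the two tabulated values of P.

```python
def chi_squared(observed, expected):
    """Sum over classes of (m' - m)^2 / m."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def interpolate(x, x0, p0, x1, p1):
    """Linear interpolation between two tabulated values of P."""
    return p0 + (x - x0) * (p1 - p0) / (x1 - x0)

# First illustration: 4, 5, or 6 points reckoned a success (13 classes).
observed_1 = [0, 7, 60, 198, 430, 731, 948, 847, 536, 257, 71, 11, 0]
expected_1 = [1, 12, 66, 220, 495, 792, 924, 792, 495, 220, 66, 12, 1]

# Second illustration: 6 points reckoned a success (8 classes).
observed_2 = [447, 1145, 1181, 796, 380, 115, 24, 8]
expected_2 = [459, 1103, 1213, 809, 364, 116, 27, 5]

chi2_1 = chi_squared(observed_1, expected_1)
chi2_2 = chi_squared(observed_2, expected_2)

p_1 = interpolate(chi2_1, 30, 0.002792, 40, 0.000072)   # n' = 13
p_2 = interpolate(chi2_2, 5, 0.659963, 6, 0.539750)     # n' = 8

print(round(chi2_1, 4), round(p_1, 4))
print(round(chi2_2, 4), round(p_2, 4))
```

Linear interpolation over so wide an interval as 30 to 40 is rough, and the value .0015 should be read as an indication of order of magnitude rather than as an exact probability.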

In the second place, the proof outlined assumes that the
theoretical law is known a priori. In a large number, perhaps
almost the majority, of practical cases in which the test is
applied this condition is not fulfilled. We determine, for example,
the constants of a frequency curve from the observations
themselves, not from a priori considerations; we determine the
“independence values” of the frequencies for a contingency
table from the given row and column totals, again not from
a priori considerations. This general case is dealt with below,
in the section headed “Comparison Frequencies based on the
Observations.”

Finally, attention should be paid to the run of the signs of
the differences m' − m. The method used pays no attention to
the order of these signs, and it may happen that χ² has quite a
moderate value and P is not small when all the positive differences
are on one side of the mode and all the negative differences on the
other, so that the mean shows a deviation from the expected value
that is quite outside the limits of sampling, or that the differences
are negative in both tails so that the standard deviation shows
an almost impossible divergence from expectation. In the first
example on the preceding page all the differences are negative up to
5 successes, positive from 6 to 10 successes, and negative again for
11 and 12 successes. This is almost the first case supposed, and
in fact we have already found (p. 267) that the mean deviates
from the expected value by 5.1 (more precisely 5.13) times its
standard error. From Table II. of Tables for Statisticians we have:

Greater fraction of the area of a normal
  curve for a deviation 5.13 . . . . .  .9999998551
Area in the tail of the curve . . . . .  .0000001449
Area in both tails . . . . . . . . . .  .0000002898

so that the probability of getting such a deviation (+ or −) on
random sampling is only about 3 in 10,000,000. The value found
for P (.0015) by the grouping used is therefore in some degree
misleading. If we regroup the distribution according to the
signs of m' − m, we find:


Successes.     Observed Frequency.     Expected Frequency.

   0-5               1426                    1586
   6-10              2659                    2497
  11-12                11                      13

  Total              4096                    4096






For this comparison n' is 3, χ² is 26.96, or practically 27, and P
is about .000001, a value much more nearly in accordance with
that suggested by the mean.
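The regrouping by runs of sign lends itself to a short routine. The sketch below is an illustration only, written in Python with the hypothetical helper name `regroup_by_sign_runs`; it merges adjacent classes so long as m' − m keeps the same sign, and then recomputes χ². For three groups n' is 3, that is 2 degrees of freedom on the assumption that n' − 1 degrees of freedom correspond to a table entered with n', and for 2 degrees of freedom P is simply exp(−χ²/2).

```python
import math

def regroup_by_sign_runs(observed, expected):
    """Merge adjacent classes while the sign of (observed - expected) persists.

    A class with zero difference is merged into the run it adjoins.
    Returns a list of (observed_total, expected_total) pairs.
    """
    groups = []
    last_sign = 0
    for o, e in zip(observed, expected):
        sign = (o > e) - (o < e)
        if groups and (sign == 0 or last_sign == 0 or sign == last_sign):
            groups[-1][0] += o
            groups[-1][1] += e
        else:
            groups.append([o, e])
        if sign != 0:
            last_sign = sign
    return [tuple(g) for g in groups]

# Twelve dice, 4096 throws: successes 0..12, with expectation C(12, k).
observed = [0, 7, 60, 198, 430, 731, 948, 847, 536, 257, 71, 11, 0]
expected = [math.comb(12, k) for k in range(13)]

groups = regroup_by_sign_runs(observed, expected)
chi2 = sum((o - e) ** 2 / e for o, e in groups)
n_prime = len(groups)
p = math.exp(-chi2 / 2.0)        # exact upper tail for 2 degrees of freedom

print(groups)
print(f"chi-square = {chi2:.2f}, n' = {n_prime}, P = {p:.7f}")
```

The three groups returned are those of the regrouped table, classes 0 to 5, 6 to 10, and 11 to 12.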

Such a regrouping of the frequency distribution by the runs of
classes that are in excess and in defect of expectation would appear
often to afford a useful and severe test of the real extent of
agreement between observation and theory. In the second example
the signs are fairly well scattered, and the regrouping has a
comparatively small effect, the mean being in almost precise agreement
with expectation. The regrouped distribution is:


Successes.     Observed Frequency.     Expected Frequency.

    0                 447                     459
    1                1145                    1103
   2-3               1977                    2022
    4                 380                     364
   5-6                139                     143
   7-8                  8                       5

  Total              4096                    4096


Here n' is 6, χ² is 5.52, and P 0.36, so that the deviations from
expectation are still well within the range of fluctuations of
sampling.

The value of P is the probability that a set of observations
will occur giving a group of deviations from theory, i.e. a value
of χ², which is more improbable than that observed. If, to take
the second illustration above, we were to repeat 4096 throws of
twelve dice a large number of times, noting the throws of sixes,
we should expect to get a worse fit to theory, i.e. a value of χ²
greater than 5.81, roughly speaking 56 times in every hundred
trials.

The value of P corresponding to χ² = 0 is necessarily unity,
for it is certain that all values of χ² must exceed zero. If the
value of P corresponding to χ² = 1 is P₁, then 1 − P₁ is the
probability of values of χ² between 0 and 1. Similarly, if the
value of P corresponding to χ² = 2 is P₂, then the probability of
values of χ² between 1 and 2 is P₁ − P₂, and so on. Thus, for
16 classes (n' = 16), we find in the tables:






      χ²           P          Differences of P.

       0       1                 .007 873
       5        .992 127         .172 388
      10        .819 739         .368 321
      15        .451 418         .279 486
      20        .171 932         .171 932
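The differencing of the column of P can be reproduced without the printed tables. The sketch below is a minimal illustration in Python; the function name `chi_square_upper_tail` is hypothetical, and the method (a power series for the incomplete gamma function) is simply one convenient way of evaluating P, not the procedure by which the tables were computed. It assumes that n' = 16 corresponds to 15 degrees of freedom.

```python
import math

def chi_square_upper_tail(chi2, df):
    """P(chi-square > chi2) for df degrees of freedom.

    Evaluates 1 - gamma(a, x) / Gamma(a), with a = df/2 and x = chi2/2,
    using the series gamma(a, x) = e^-x x^a * sum_n x^n / (a(a+1)...(a+n)).
    """
    if chi2 <= 0:
        return 1.0
    a, x = df / 2.0, chi2 / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-16:
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(x) - x - math.lgamma(a))
    return 1.0 - lower

limits = [0, 5, 10, 15, 20]
df = 16 - 1                      # n' = 16 classes, comparison given a priori
tails = [chi_square_upper_tail(v, df) for v in limits]

# Probability of chi-square falling in 0-5, 5-10, 10-15, 15-20, and over 20.
differences = [tails[i] - tails[i + 1] for i in range(len(tails) - 1)]
differences.append(tails[-1])
per_thousand = [round(1000 * d) for d in differences]

for lo, t, d in zip(limits, tails, differences):
    print(f"{lo:>3}  P = {t:.6f}  difference = {d:.6f}")
print(per_thousand)
```

The last line gives the number of cases to be expected in each interval per 1000 sets of random sampling.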


We should expect, therefore, in, say, 1000 sets of random
sampling with 16 classes, about 8 cases of χ² between 0 and 5,
about 172 cases between 5 and 10, 368 between 10 and 15,
279 between 15 and 20, and 172 over 20. The following table
shows the results obtained for the more modest number of 100
sets of trials, and gives very fair agreement with theory, especially
considering that the assumption of normality can hardly be
strictly true. The trials were carried out by throwing 200
beans into a revolving circular tray with sixteen equal radial
compartments, and counting the number of beans in each
compartment. The value of χ² was then computed, taking the
expected frequency as 200/16 = 12.5.


                 Number of Trials giving a Value of χ²
                 lying between the Limits on the Left.
      χ²
                     Expected.         Observed.

     0-5                 0.8               0
     5-10               17.2              20
    10-15               36.8              36
    15-20               27.9              30.5
    20 upwards          17.2              13.5


If we treat this in its turn as a comparison of observation with
theory, we find, bracketing the first two groups together, so as
to reduce the number of classes to four, χ² = 1.28, whence from
the tables P is approximately 0.74. That is to say, we should
expect a worse agreement with theory about three times out
of four.
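An experiment of this kind is easily imitated by random numbers. The following sketch in Python is only an imitation under stated assumptions, not a reconstruction of the original trials: each of 200 beans is supposed to fall independently and with equal chance into any of 16 compartments, the number of sets is raised to 2000, and the seed and the name `bean_trial` are arbitrary choices made for the illustration.

```python
import random

def bean_trial(rng, beans=200, compartments=16):
    """Throw the beans once and return chi-square against equal expectation."""
    counts = [0] * compartments
    for _ in range(beans):
        counts[rng.randrange(compartments)] += 1
    expected = beans / compartments          # 200/16 = 12.5
    return sum((c - expected) ** 2 / expected for c in counts)

rng = random.Random(1932)                    # fixed seed, chosen arbitrarily
values = [bean_trial(rng) for _ in range(2000)]

# Tally chi-square into the classes 0-5, 5-10, 10-15, 15-20, 20 upwards.
tally = [0] * 5
for v in values:
    tally[min(int(v // 5), 4)] += 1

mean = sum(values) / len(values)
print("class frequencies:", tally)
print(f"mean chi-square = {mean:.2f}")
```

The class frequencies, divided by 20, may be set beside the expected column of the table; the mean of χ² should lie near 15, one less than the number of compartments.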

It follows from what was said above that, in any series of trials
by simple sampling, equal numbers of cases should be found within
equal intervals of P, e.g. from 1.0 to 0.9, from 0.9 to 0.8, from
0.8 to 0.7, and so on. The frequency distribution of P, that is to
say, when we fulfil the conditions of simple sampling, is uniform
over the whole range from 0 to 1. Thus for a rough grouping
into four classes the above series of trials gave:


                 Number of Trials giving a Value of P
                 lying between the Limits on the Left.
       P
                     Expected.         Observed.

   1.00-0.75             25               23
   0.75-0.50             25               30
   0.50-0.25             25               22
   0.25-0                25               25


The value of χ² for this comparison is 1.52, giving P = 0.68, or
we should expect a worse fit roughly twice in every three trials.

COMPARISON FREQUENCIES BASED ON THE
OBSERVATIONS.

Contingency Tables. Attention was specially directed above
to the fact that the theoretical frequencies were assumed to be
given a priori. The theory of the more general case, in which
comparison is made with frequencies determined by the aid of the
observations themselves, has only recently been fully worked out
(Fisher, ref. 118). The most important practical case of the
kind is that of association or contingency tables in which the
observed frequencies are compared with the independence-values
obtained from the totals of rows and columns, that is, the values

$$(A_m B_n)_0 = \frac{(A_m)(B_n)}{N}$$

of Chapter V. § 6, p. 64, and in which the differences

$$\delta_{mn} = (A_m B_n) - (A_m B_n)_0$$

are used as an indication of the divergence from independence.
The rule to which the theory leads is a very simple one: the χ²
method is still applicable, but the tables must be entered with n'
equal to the number of algebraically independent frequencies (or
values of δ) increased by unity, and not with n' equal to the
number of compartments in the table. Now, if in any column
of the contingency table we are given all the values of δ but one
(say, the marginal value at the bottom), the remaining one can be
determined, because the sum of the δ's for every column must be
zero. The same statement must hold good for every row. Hence,
if r be the number of rows, c the number of columns, the number
of algebraically independent values of δ is (r − 1)(c − 1), and the
tables must be entered with the value

$$n' = (r - 1)(c - 1) + 1.$$

The student will realise that this is a reasonable rule if he
considers that when we take n' as the number of classes, the
comparison frequencies being given a priori, we are taking it as
one more than the number of algebraically independent frequencies,
since the total number of observations is fixed.

The following will serve as an illustration (Yule, ref. 5 of
Chapter V.). Sixteen pieces of photographic paper were printed
down to different depths of colour from nearly white to a very
deep blackish brown. Small scraps were cut from each sheet and
pasted on cards, two scraps on each card one above the other,
combining scraps from the several sheets in all possible ways, so
that there were 256 cards in the pack. Twenty observers then
went through the pack independently, each one naming each tint
either "light," "medium," or "dark."

Table showing the Name (light, medium, or dark) assigned to each of two
Pieces of Photographic Paper on a Card: 256 Cards and 20 Observers.
Upper figure, observed frequency; central figure, independence frequency;
bottom figure, difference δ. (Yule, ref. 6 of Chap. V., Table XXI.)

Name assigned to                 Name assigned to Upper Tint on Card.
Lower Tint on Card.                                                      Total.
                                 Light.      Medium.      Dark.

Light      observed               850          571         580           2001
           independence           785          633         583
           difference             +65          -62          -3

Medium     observed               618          593         455           1666
           independence           653          527         486
           difference             -35          +66         -31

Dark       observed               540          456         457           1453
           independence           570          460         423
           difference             -30           -4         +34

Total                            2008         1620        1492           5120












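      4225/785 . . . . . .  5.38
      3844/633 . . . . . .  6.07
         9/583 . . . . . .   .02
      1225/653 . . . . . .  1.88
      4356/527 . . . . . .  8.27
       961/486 . . . . . .  1.98
       900/570 . . . . . .  1.58
        16/460 . . . . . .   .03
      1156/423 . . . . . .  2.73

      Total  . . . . . . . 27.94

      n' . . . . . . . . .  5
      P  . . . . . . . . .   .000012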


The results are shown in the preceding table, the upper figure in
each compartment of the table being the observed frequency of
the corresponding pair of names. Below the observed frequency
are given the independence frequency and the difference δ.

It will be seen that the observed figures are not very close
to the independence-values, there being apparently a marked
tendency to give the same names to the two tints on any card, so
that all the diagonal frequencies are in excess of the independence-
values and all the others in defect.

Working out χ² as shown, the total comes to 27.94, or practically
28. Since r and c are both 3, n' must be taken as (2 × 2) + 1,
that is, 5. Turning up the tables in the column n' = 5, we find
P = .000012; that is to say, we would only expect to find so great
a divergence from independence, in random sampling, a little
more than once in 100,000 trials, so the result is certainly
significant.
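The whole computation for a contingency table, from the formation of the independence-values to the entry of the tables with n' = (r − 1)(c − 1) + 1, may be sketched as follows. This is an illustrative Python sketch, with hypothetical function names; the observed frequencies are those of the table of names, taken so as to agree with the printed row totals 2001, 1666, 1453 and column totals 2008, 1620, 1492. Since the independence-values are here carried to full precision instead of being rounded to whole numbers, χ² comes out slightly different from 27.94. P is found from the closed form for 4 degrees of freedom, on the assumption that n' = 5 corresponds to n' − 1 = 4 degrees of freedom.

```python
import math

def independence_values(table):
    """Row total times column total over grand total, for every compartment."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def contingency_chi_square(table):
    """Return (chi-square, n') with n' = (r - 1)(c - 1) + 1."""
    expected = independence_values(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    n_prime = (len(table) - 1) * (len(table[0]) - 1) + 1
    return chi2, n_prime

# Rows: lower tint named light, medium, dark; columns: upper tint likewise.
observed = [
    [850, 571, 580],
    [618, 593, 455],
    [540, 456, 457],
]

chi2, n_prime = contingency_chi_square(observed)
half = chi2 / 2.0
p = math.exp(-half) * (1.0 + half)   # upper tail for 4 degrees of freedom

print(f"chi-square = {chi2:.2f}, n' = {n_prime}, P = {p:.6f}")
```

The same two functions apply unchanged to a table with any number of rows and columns, though the closed form for P in the last step holds only when n' is 5.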

Association Tables. When we are dealing with an association
table there are only two rows and two columns, and consequently
n' must be taken as (2 − 1)(2 − 1) + 1, that is, 2. But no column
for n' = 2 is given in Tables for Statisticians and Biometricians, the
lowest value taken being n' = 3, and a supplementary table (XV. c)
is not sufficiently detailed: the necessary table, reprinted by
permission from the Journal of the Royal Statistical Society
(ref. 119), will be found at the end of this Supplement. As will
be seen from the following illustrations, the required probability
can also be determined from the table of areas of the normal
curve, but it is very convenient to keep the arithmetic in the
usual form.

Example i. (Data from Chapter III., p. 37.) The following
data are there cited for colour of flower and prickliness of fruit in
Datura: the independence-frequencies have been entered below
the numbers of observations.






                              Fruit.
  Flower.                                          Total.
                      Prickly.       Smooth.

  Violet                47             12            59
                        48.337         10.663

  White                 21              3            24
                        19.663          4.337

  Total                 68             15            83


Here δ is 1.337, and

$$\chi^2 = (1.337)^2\left(\frac{1}{48.337}+\frac{1}{10.663}+\frac{1}{19.663}+\frac{1}{4.337}\right) = .708.$$

Turning up this value of χ² in the table on p. 388, we find by
interpolation P = .400. As stated in the text, the association,
negative in this case, is "so small that no stress can be laid on it
as indicating anything but a fluctuation of sampling."

Precisely the same result can be arrived at by working out the
standard error of the difference between the proportions of violet
and of white flowers that have smooth fruits, taking the ratio of
the difference to its standard error and then using the table of
areas of the normal curve. Thus:

Proportion of violet flowers that have smooth
  fruits, 12/59 or . . . . . . . . . . . . . .  .2033
Proportion of white flowers that have smooth
  fruits, 3/24 or  . . . . . . . . . . . . . .  .1250
Difference . . . . . . . . . . . . . . . . . .  .0783
Proportion of all flowers that have smooth fruits,
  15/83 or . . . . . . . . . . . . . . . . . .  .1807

Standard error of the difference between proportions of smooth
fruits in sampling from a universe in which the proportions are
.1807 and .8193, and the numbers in the samples 59 and 24
respectively:

$$\sqrt{.8193 \times .1807\left(\frac{1}{59}+\frac{1}{24}\right)} = .0932.$$

Hence the ratio of the observed difference to its standard error is
.0783/.0932 or .840.






Interpolating in the table of areas of the normal curve on
p. 310, or taking the required figure directly from Table II. of
Tables for Statisticians, we have:

Greater fraction of area for a deviation of .84 in
  the normal curve . . . . . . . . . . . . . .  .7995
Area in the tail . . . . . . . . . . . . . . .  .2005
Area in both tails . . . . . . . . . . . . . .  .401

That is to say, the probability of getting a difference, of either
sign, as great as or greater than that actually observed is .401,
agreeing, within the accuracy of the arithmetic, with the
probability given by the χ² method.

The same result would again have been obtained had we worked
from the columns instead of from the rows, and considered the
difference between the proportions of white flowers for prickly and
for smooth fruits respectively.
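The agreement of the two methods is not accidental: for the fourfold table χ² is the square of the ratio of the difference of proportions to its standard error, so that P is the area in both tails of the normal curve beyond that ratio. The following Python sketch, offered as an illustration only, works the Datura figures both ways; the variable names are arbitrary, and the normal areas are obtained from the complementary error function instead of from the printed table.

```python
import math

# Datura: violet flowers (prickly, smooth), then white flowers.
a, b = 47, 12
c, d = 21, 3
n = a + b + c + d
rows = (a + b, c + d)                 # 59 violet, 24 white
cols = (a + c, b + d)                 # 68 prickly, 15 smooth

# Method 1: chi-square from the independence-frequencies.
observed = [a, b, c, d]
expected = [r * k / n for r in rows for k in cols]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Method 2: difference of the proportions of smooth fruits over its
# standard error, using the pooled proportion 15/83.
p_violet = b / rows[0]
p_white = d / rows[1]
p_all = cols[1] / n
std_error = math.sqrt(p_all * (1 - p_all) * (1 / rows[0] + 1 / rows[1]))
ratio = (p_violet - p_white) / std_error

# Area in both tails of the normal curve beyond the ratio.
p_value = math.erfc(abs(ratio) / math.sqrt(2.0))

print(f"chi-square = {chi2:.3f}, ratio = {ratio:.3f}, ratio^2 = {ratio**2:.3f}")
print(f"P = {p_value:.3f}")
```

The square of the ratio reproduces χ² to the full accuracy of the arithmetic, whichever pair of proportions is compared.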

Example ii. (Data from ref. 6 of Chapter III., Table XIV.)
The following table shows the result of inoculation against cholera
on a certain tea estate:



                       Not-attacked.     Attacked.     Total.

  Inoculated              431               5            436
                          427.7             8.3

  Not-inoculated          291               9            300
                          294.3             5.7

  Total                   722              14            736


As in the last example, the independence-frequencies have been
given below the numbers observed. The value of δ is 3.3, and
χ² is 3.27.

From the table on p. 389 P is .0706.

Working from the proportions attacked, we can arrive at the
same result:

Proportion attacked amongst inoculated . . . . .  .01147
Proportion attacked amongst not-inoculated . . .  .03000
Difference . . . . . . . . . . . . . . . . . . .  .01853

The standard error of the difference is

$$\sqrt{.98098 \times .01902\left(\frac{1}{436}+\frac{1}{300}\right)} = .01025.$$

The ratio of the difference to its standard error is therefore
.01853/.01025, or 1.808.

Greater fraction of normal curve for a deviation of 1.808 is  .96470
Fraction in tail . . . . . . . . . . . . . . . . . . . . . .  .03530
Fraction in the two tails  . . . . . . . . . . . . . . . . .  .07060

As before, both methods must lead to the same result.

An Aggregate of Tables. It may often happen that we have
formed a number of contingency or association tables (more
often the latter than the former) for similar data from different
fields. All may give, perhaps, a positive association, but the
values of P may run so high that we do not feel any great
confidence even in the aggregate result. The question then arises
whether we cannot obtain a single value of P for the aggregate as
a whole, telling us what is the probability of getting by mere
random sampling a series of divergences from independence as
great as or greater than those observed. The question is usually
answered by pooling the tables; but, in view of the fallacies that
may be introduced by pooling (cf. Chapter IV. §§ 6 and 7), this
method is not quite satisfactory. A better answer is given by the
application of the present general rule. Add up all the values of
χ² for the different tables, thus obtaining the value of χ² for the
aggregate, and enter the P-tables with a value of n' equal to the
total of algebraically independent frequencies increased by unity:
that is, take n' as given by

$$n' = \sum (r - 1)(c - 1) + 1,$$

the summation extending over all the tables of the aggregate.

For the association table there is only one algebraically
independent value of δ. Hence if we are testing the divergence from
independence of an aggregate of association tables, we must add
together the values of χ² and enter the P-tables with n' taken as
one more than the number of tables in the aggregate.

Thus from ref. 6 of Chapter III., from which the data of
Example ii. were cited, we take the following values of χ² and of
P for six tables that include that example. They refer to six
different estates in the same group.

               χ²           P

              9.34        .0022
              6.08        .014
              2.51        .11
              3.27        .071
              5.61        .018
              1.59        .21

  Total      28.40



The association between inoculation and protection from attack
is positive for each estate, but for only one of the tables is the
value of P so small that we can say the result is very unlikely to
have arisen as a fluctuation of sampling. Adding up the values
of χ², the total is 28.40, and entering the column for n' = 7 (one
more than the number of tables considered), we find

               χ²           P

               28        .000094
               29        .000061


whence by interpolation the value of P is .000081, i.e. we should
only expect to get a total χ² as great as or greater than this, on
random sampling, 81 times in 1,000,000 trials. We can therefore
regard the results as significant with a high degree of confidence.

We may, I think, go further: for all the observed associations
are positive, and in six cases there are 2⁶ or 64 possible
permutations of sign. We should therefore only expect to get an equal
or greater total value of χ² and the tables all showing positive
association, not 81 times in 1,000,000 trials but 81/64 or, roundly,
1.3 times. P for the observed event (S(χ²) = 28.4 and all associations
positive) is therefore only .0000013.
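The rule for an aggregate of association tables reduces to very little arithmetic. The sketch below, in Python and for illustration only, takes the six values of χ² as 9.34, 6.08, 2.51, 3.27, 5.61 and 1.59, which add to the stated total 28.40, enters with n' = 7, and evaluates P from the closed form for an even number of degrees of freedom (assuming n' − 1 = 6 degrees of freedom). The function name is hypothetical. The value obtained directly will differ a little from the interpolated .000081, as is to be expected of linear interpolation between 28 and 29.

```python
import math

def upper_tail_even_df(chi2, df):
    """P(chi-square > chi2) for an EVEN number of degrees of freedom."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = chi2 / 2.0
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# chi-square for each of the six fourfold tables (one per estate).
chi2_values = [9.34, 6.08, 2.51, 3.27, 5.61, 1.59]

total = sum(chi2_values)
n_prime = len(chi2_values) + 1        # one independent delta per table, plus 1
p = upper_tail_even_df(total, n_prime - 1)

# All six associations positive: one of the 2^6 equally likely
# arrangements of sign under independence.
p_all_positive = p / 2 ** len(chi2_values)

print(f"total chi-square = {total:.2f}, n' = {n_prime}")
print(f"P = {p:.6f}, P with all signs positive = {p_all_positive:.8f}")
```

The division by 64 in the last step is the further argument of the text, which rests on the six signs being independent and equally likely to be positive or negative.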

Experimental Illustrations of the General Case. The formulæ
for the general case, as for the special case in which the frequencies
with which comparison is made are given a priori, can be checked
by experiment.

The numbers of beans counted in each of the sixteen
compartments of the revolving circular tray mentioned on p. 377 above
were entered as the frequencies of a table (1) with 4 rows and
4 columns, (2) with 2 rows and 8 columns, and the value of χ²
computed for each table for divergence from independence. For
the two cases we have

        n' = (3 × 3) + 1 = 10
    and n' = (1 × 7) + 1 = 8

respectively. Differencing the columns for P corresponding to
these two values of n', we obtain the theoretical frequency-
distributions given in the columns headed "Expectation" in Table A.
The observed distributions of the values of χ² in 100 experimental
tables are given in the columns headed "Observation." It will be
seen that the agreement between expectation and observation is
excellent for so small a number of observations. If the goodness
of fit be tested by the χ² method, grouping together the frequencies
from χ² = 15 upwards, so that n' is 4, χ² is found to be 2.27 for
the 4 × 4 tables and 4.36 for the 2 × 8 tables, giving P = 0.52 in
the first case and 0.22 in the second.





Table A. Theoretical Distribution of χ², calculated from Independence-values,
in Tables with 16 Compartments, compared with the Actual Distributions
given by 100 Experimental Tables. In the first case n' must be taken as
10, in the second as 8. (Ref. 119.)

                  4 Rows, 4 Columns.              2 Rows, 8 Columns.
     χ²
              Expectation.   Observation.     Expectation.   Observation.

    0-5           16.6           17               34.0           29.5
    5-10          48.4           44               47.1           56.5
   10-15          26.0           32               15.3           10
   15-20           7.3            6                3.0            3
   20-             1.8            1                0.6            1

   Total         100.1          100              100.0          100


For tables with 2 rows and 2 columns 350 experimental tables of
100 observations each were available. The observed distribution of
values of χ², calculated from the independence frequencies, is shown
in Table B, together with the theoretical distribution obtained by
differencing the table on pp. 388-389. Testing goodness of fit on
Table B as it stands, n' is 10, χ² works out at 7.53, and P is 0.583.

Table B. Theoretical Distribution of χ² for a Table with 2 Rows and 2
Columns, when χ² is calculated from the Independence-values, compared
with the Actual Results for 350 Experimental Tables. (Ref. 119.)

                          Number of Tables.
   Value of χ².
                      Expected.      Observed.

    0   -0.25          134.02          122
    0.25-0.50           48.15           54
    0.50-0.75           32.56           41
    0.75-1.00           24.21           24
    1   -2              56.00           62
    2   -3              25.91           18
    3   -4              13.22           13
    4   -5               7.05            6
    5   -6               3.86            5
    6   -                5.01            5

    Total              349.99          350




The theorem last given for evaluating P for an aggregate of
tables is illustrated by the experimental data of Tables C and D.
The values of χ² for the 350 fourfold tables of Table B were
added together in pairs, giving 175 pairs. According to theory
the resulting frequency-distribution for the totals of pairs of
χ²'s should be given by differencing the column of the P-table for
n' = 3. The results of theory and observation are compared in
the first pair of columns of Table C. Testing goodness of fit,
grouping the values of 7 and upwards, n' is 8, χ² is 5.53, and
P is 0.60.

Grouping the values of χ² for the 350 experimental tables
similarly in sets of three and summing, we get the observed
distribution on the right of Table C, and the theoretical
distribution by differencing the column of the P-table for n' = 4.
Grouping values of χ² 8 and upwards, and testing goodness of fit
between theory and observation, n' is 9, χ² is 2.18, and P 0.97.

Table C. Theoretical Distribution of Totals of χ² (calculated from
Independence-values) for Pairs and for Sets of Three Tables with 2 Rows
and 2 Columns, compared with the Actual Distributions given by Experimental
Tables. n' must be taken as 3 in the first case, and 4 in the second.

                   Pairs of Tables.               Sets of 3 Tables.
  Sum of χ²'s.
              Expectation.   Observation.     Expectation.   Observation.

    0-1           68.9           67               23.1           21
    1-2           41.8           46               26.5           26
    2-3           25.3           22               21.0           22
    3-4           15.4           19               15.1           19
    4-5            9.3            7               10.4            9
    5-6            5.6            3                7.0            7
    6-7            3.4            6                4.6            4
    7-8            2.1            3                3.0            4
    8-             3.2            2                5.3            4

   Total         175.0          175              116.0          116


Table D makes a similar comparison for the values of χ²,
calculated from independence, for 100 pairs of 4 × 4 tables.
Here there are 9 algebraically independent δ's for each table of
the pair, and consequently n' must be taken as 19. Differencing
the P-table for n' = 19, the expected distribution is obtained, which
is shown in the first column of Table D, the observed distribution
being given in the second column. Taking the two groups at the
bottom of the table together and testing goodness of fit, χ² is
found to be 4.11, n' is 5, and P is 0.39.

Table D. Theoretical Distribution of Totals of χ² (calculated from
Independence-values) for Pairs of Tables with 4 Rows and 4 Columns,
compared with the Actual Distribution given by Experimental Tables.

  Sum of two χ²'s.     Expectation.     Observation.

       0-10                 6.8               8
      10-15                27.0              27
      15-20                32.9              31
      20-25                20.8              27
      25-30                 8.8               6
      30-                   3.7               1

      Total               100.0             100


The general theorem that n' must be taken equal to the number
of algebraically independent frequencies increased by unity applies
not only to association and contingency tables, but to all cases in
which the frequencies observed are connected with those expected
by a number of linear relations, beyond their restriction to the
same total frequency (Fisher, ref. 118). Thus, if a frequency
curve has been fitted by the mean and standard deviation, n'
should be taken as 2 less than the number of classes; if it has
been fitted by the first four moments, n' should be taken as four
less than the number of classes.






Table of the Values of P for Divergence from Independence in the
Fourfold Table.

A. χ² = 0 to χ² = 1 by steps of 0.01.

    χ²        P          Δ          χ²        P          Δ

   0.00    1.00000     7966        0.50    0.47950      436
   0.01    0.92034     3280        0.51    0.47514      430
   0.02    0.88754     2505        0.52    0.47084      423
   0.03    0.86249     2101        0.53    0.46661      418
   0.04    0.84148     1842        0.54    0.46243      411
   0.05    0.82306     1656        0.55    0.45832      406
   0.06    0.80650     1516        0.56    0.45426      400
   0.07    0.79134     1404        0.57    0.45026      395
   0.08    0.77730     1312        0.58    0.44631      389
   0.09    0.76418     1235        0.59    0.44242      384
   0.10    0.75183     1169        0.60    0.43858      379
   0.11    0.74014     1111        0.61    0.43479      374
   0.12    0.72903     1060        0.62    0.43105      369
   0.13    0.71843     1015        0.63    0.42736      365
   0.14    0.70828      974        0.64    0.42371      360
   0.15    0.69854      938        0.65    0.42011      355
   0.16    0.68916      905        0.66    0.41656      351
   0.17    0.68011      874        0.67    0.41305      346
   0.18    0.67137      845        0.68    0.40959      343
   0.19    0.66292      820        0.69    0.40616      338
   0.20    0.65472      795        0.70    0.40278      334
   0.21    0.64677      773        0.71    0.39944      330
   0.22    0.63904      752        0.72    0.39614      326
   0.23    0.63152      731        0.73    0.39288      322
   0.24    0.62421      713        0.74    0.38966      318
   0.25    0.61708      696        0.75    0.38648      315
   0.26    0.61012      679        0.76    0.38333      311
   0.27    0.60333      663        0.77    0.38022      308
   0.28    0.59670      648        0.78    0.37714      304
   0.29    0.59022      634        0.79    0.37410      301
   0.30    0.58388      620        0.80    0.37109      297
   0.31    0.57768      607        0.81    0.36812      294
   0.32    0.57161      595        0.82    0.36518      291
   0.33    0.56566      583        0.83    0.36227      287
   0.34    0.55983      572        0.84    0.35940      285
   0.35    0.55411      560        0.85    0.35655      281
   0.36    0.54851      551        0.86    0.35374      278
   0.37    0.54300      540        0.87    0.35096      276
   0.38    0.53760      530        0.88    0.34820      272
   0.39    0.53230      521        0.89    0.34548      270
   0.40    0.52709      512        0.90    0.34278      267
   0.41    0.52197      503        0.91    0.34011      264
   0.42    0.51694      495        0.92    0.33747      261
   0.43    0.51199      487        0.93    0.33486      258
   0.44    0.50712      479        0.94    0.33228      256
   0.45    0.50233      471        0.95    0.32972      253
   0.46    0.49762      463        0.96    0.32719      251
   0.47    0.49299      457        0.97    0.32468      248
   0.48    0.48842      449        0.98    0.32220      246
   0.49    0.48393      443        0.99    0.31974      243
   0.50    0.47950      436        1.00    0.31731      241





B. χ² = 1 to χ² = 10 by steps of 0.1.

    χ²        P          Δ          χ²        P          Δ

    1.0    0.31731     2304         5.5    0.01902      106
    1.1    0.29427     2095         5.6    0.01796       99
    1.2    0.27332     1911         5.7    0.01697       94
    1.3    0.25421     1749         5.8    0.01603       89
    1.4    0.23672     1605         5.9    0.01514       83
    1.5    0.22067     1477         6.0    0.01431       79
    1.6    0.20590     1361         6.1    0.01352       74
    1.7    0.19229     1258         6.2    0.01278       71
    1.8    0.17971     1163         6.3    0.01207       66
    1.9    0.16808     1078         6.4    0.01141       62
    2.0    0.15730     1000         6.5    0.01079       59
    2.1    0.14730      929         6.6    0.01020       56
    2.2    0.13801      864         6.7    0.00964       52
    2.3    0.12937      803         6.8    0.00912       50
    2.4    0.12134      749         6.9    0.00862       47
    2.5    0.11385      699         7.0    0.00815       44
    2.6    0.10686      651         7.1    0.00771       42
    2.7    0.10035      609         7.2    0.00729       39
    2.8    0.09426      568         7.3    0.00690       38
    2.9    0.08858      532         7.4    0.00652       35
    3.0    0.08326      497         7.5    0.00617       33
    3.1    0.07829      465         7.6    0.00584       32
    3.2    0.07364      436         7.7    0.00552       30
    3.3    0.06928      408         7.8    0.00522       28
    3.4    0.06520      383         7.9    0.00494       26
    3.5    0.06137      359         8.0    0.00468       25
    3.6    0.05778      337         8.1    0.00443       24
    3.7    0.05441      316         8.2    0.00419       23
    3.8    0.05125      296         8.3    0.00396       21
    3.9    0.04829      279         8.4    0.00375       20
    4.0    0.04550      262         8.5    0.00355       19
    4.1    0.04288      246         8.6    0.00336       18
    4.2    0.04042      231         8.7    0.00318       17
    4.3    0.03811      217         8.8    0.00301       16
    4.4    0.03594      205         8.9    0.00285       15
    4.5    0.03389      192         9.0    0.00270       14
    4.6    0.03197      181         9.1    0.00256       14
    4.7    0.03016      170         9.2    0.00242       13
    4.8    0.02846      160         9.3    0.00229       12
    4.9    0.02686      151         9.4    0.00217       12
    5.0    0.02535      142         9.5    0.00205       10
    5.1    0.02393      134         9.6    0.00195       11
    5.2    0.02259      126         9.7    0.00184       10
    5.3    0.02133      119         9.8    0.00174        9
    5.4    0.02014      112         9.9    0.00165        8
    5.5    0.01902      106        10.0    0.00157        8

For values of P up to χ² = 30, by units, see Table XV. (c),
p. 30 of Tables for Statisticians and Biometricians.






ADDITIONAL REFERENCES.

History of Statistics (p. 6).

(1) Koren, J. (edited by), The History of Statistics, their Development and
    Progress in many Countries, New York, The Macmillan Co., 1918.
    (A collection of articles, mainly on the progress of official statistics,
    written by a specialist for each country.)

(2) Walker, Helen M., Studies in the History of Statistical Method,
    Baltimore, Williams & Wilkins Co., 1929. (Most detailed on recent
    history: chapters on the Normal Curve, Moments, Percentiles,
    Correlation, Spearman's theory of Two Factors for Intelligence,
    Statistics as a Subject of Instruction in American Universities, and
    the Origin of certain Technical Terms. Useful bibliographies.)

(3) Hotelling, H., "British Statistics and Statisticians Today," Jour.
    Amer. Stat. Assoc., vol. xxv., 1930, p. 186.

Contingency (p. 73).

(4) Pearson, Karl, "On the Measurement of the Influence of Broad
    Categories on Correlation," Biometrika, vol. ix., 1913, p. 116.

(5) Pearson, Karl, "On the General Theory of Multiple Contingency with
    Special Reference to Partial Contingency," Biometrika, vol. xi., 1916,
    p. 145. (An extension of the method of contingency coefficients to
    classification subjected to various conditions; arithmetical examples
    are provided in the undermentioned paper.)

(6) Pearson, Karl, and J. F. Tocher, "On Criteria for the Existence of
    Differential Death-Rates," Biometrika, vol. xi., 1916, p. 159.

(7) Ritchie-Scott, A., "The Correlation Coefficient of a Polychoric Table,"
    Biometrika, vol. xii., 1918, p. 93. (Considers various methods of
    measuring association with special reference to 4 × 3-fold
    classifications.)

(8) Pearson, Karl, and E. S. Pearson, "On Polychoric Coefficients of
    Correlation," Biometrika, vol. xiv., 1922, p. 127.

The Mode (p. 130).

(9) Doodson, Arthur T., "Relation of the Mode, Median and Mean, in
    Frequency Curves," Biometrika, vol. xi., 1916-17, p. 429. (Gives a
    proof of the relation noted on p. 121.)

Index-numbers (p. 130).

There are useful discussions as to method in the following:

(10) Knibbs, G. H., "Prices, Price-Indexes, and Cost of Living in Australia,"
    Commonwealth of Australia, Labour and Industrial Branch, Report
    No. 1, 1912.

(11) Wood, Frances, "The Course of Real Wages in London, 1900-12,"
    Jour. Roy. Stat. Soc., vol. lxxvii., 1913-14, p. 1.

(12) Working Classes, Cost of Living Committee, 1918, Report (Cd.
    8980, 1918), H.M. Stationery Office.

(13) Bowley, A. L., "The Measurement of Changes in Cost of Living,"
    Jour. Roy. Stat. Soc., vol. lxxxii., 1919, p. 343.

(14) Bennett, T. L., "The Theory of Measurement of Changes in the Cost
    of Living," Jour. Roy. Stat. Soc., vol. lxxxiii., 1920, p. 455.

(15) Flux, A. W., "The Measurement of Price Changes," Jour. Roy. Stat.
    Soc., vol. lxxxiv., 1921, p. 167.

(16) Fisher, Irving, "The Best Form of Index-number," Quart. Pub.
    Amer. Stat. Assoc., March 1921, p. 533.





(17) Persons, W M , “ Fisher's Formula for Index-numbers," Sev, Econ, 

Statistics, vol m , 1921, p 103 

(18) March, L , “ Les modes de mesure du mouvement general des pnx," 

Metron, vol i , No 4, 1921, p 40 

(19) Fisher, Irving, The Mating of Index-numbers, Houghton Mifflin Co , 

Boston and New York, 1922 (Useful as a repertory of formulse, with 
tests of the results given on certam American data , otherwise, cf 
reviews in Economic Journal, vol xxsni , p 90 and p 246, and 
Joui Boy Staf 8oc , vol Ixxxvi , p 424, and vol Ixxxvii , p 89 ) 

(20) hlARSHALL, A , Money, Credit, and Commerce, Macmillan, London, 1923 

For the student of the cost of living in Great Britain the following
are useful:

(21) “Labour Gazette Index Number: Scope and Method of Compilation,”
Lab. Gaz., March 1920 and Feb. 1921.

(22) “Final Report on the Cost of Living of the Parliamentary Committee
of the Trades Union Congress” (The Committee, 32 Eccleston Sq.,
London, 1921); critical notices of the same in the Labour Gazette,
Aug. and Sept. 1921, and review by A. L. Bowley, Econ. Jour.,
Sept. 1921.

(23) Bowley, A. L., Prices and Wages in the United Kingdom, 1914-20,
Oxford, 1920 (Clarendon Press).

(24) March, L., “Rapport sur les indices de la situation économique,”
Bulletin de l’Institut International de Statistique, t. xxi., pt. 2, p. 3.

(25) Gini, C., “Quelques considérations au sujet de la construction des
nombres indices des prix, etc.,” Metron, vol. iv., 1924, p. 3.

(26) Edgeworth, F. Y., “The Plurality of Index Numbers,” Economic
Journal, vol. xxxv., 1925, p. 379.

(27) Edgeworth, F. Y., “The Element of Probability in Index Numbers,”
Jour. Roy. Stat. Soc., vol. lxxxviii., 1925, p. 557.

(28) Bowley, A. L., “The Influence on the Precision of Index Numbers
of the Correlation between the Prices of Commodities,” Jour. Roy.
Stat. Soc., vol. lxxxix., 1926, p. 300.

Correlation: General, and History (p. 188).

(29) Pearson, K., “Notes on the History of Correlation,” Biometrika, vol.
xiii., 1920, p. 25.

(30) Baten, W. D., “Correction for the Moments of a Frequency Distribution
in Two Variables,” Ann. Math. Stats., vol. ii., 1931, p. 309.

(31) Frisch, Ragnar, “Correlation and Scatter in Statistical Variables,”
Nordic Statistical Journal, vol. i., 1929, p. 36.

Fit of Regression Lines (p. 209).

(32) Pearson, Karl, “On the Application of Goodness of Fit Tables to test
Regression Curves and Theoretical Curves used to describe Observational
or Experimental Data,” Biometrika, vol. xi., 1916-17, p. 237.
(Criticises and extends the work of Slutsky.)

(33) Fisher, R. A., “The Goodness of Fit of Regression Formulae, and the
Distribution of Regression Coefficients,” Jour. Roy. Stat. Soc., vol.
lxxxv., 1922, p. 597.

Correlation in Case of Non-linear Regression (p. 209).

(34) Wicksell, S. D., “On Logarithmic Correlation, with an Application to
the Distribution of Ages at First Marriage,” Meddelande från Lunds
Astronomiska Observatorium, No. 84, 1917; Svenska Aktuarieföreningens
Tidskrift.





(35) Wicksell, S. D., “The Correlation Function of Type A,” Kungl.
Svenska Vetenskapsakademiens Handl., Bd. lviii., 1917.

(36) Pearson, K., “On a General Method of Determining the Successive
Terms in a Skew Regression Line,” Biometrika, vol. xiii., 1921, p. 296.

(37) Pearson, Karl, “On the Correction necessary for the Correlation
Ratio $\eta$,” Biometrika, vol. xiv., 1923, p. 412.

For fitting of polynomials, see under Correlation: Time-problem.

Correlation: Effect of Errors of Observation, etc. (p. 225).

(38) Hart, Bernard, and C. Spearman, “General Ability, its Existence
and Nature,” Brit. Jour. Psychology, vol. v., 1912, p. 51.

There has been a good deal of controversy about these formulae and
their applications in psychological work: cf. (267) Brown and Thomson,
and the references there given; critical notice of the same in
Brit. Jour. Psych., vol. xii., 1921, p. 100; and:

(39) Stead, H. G., “The Correction of Correlation Coefficients,” Jour. Roy.
Stat. Soc., vol. lxxxvi., 1923, p. 412.

Standardisation or Correction of Death-rates (p. 226).

For the methods of standardisation in present use in England and
Wales see:

(40) Seventy-fourth Annual Report of the Registrar-General of Births, Deaths,
and Marriages in England and Wales (1911). [Cd. 6578, 1913.]

Reference may also be made to:

(41) Wolfenden, H. H., “On the Methods of comparing the Mortalities
of Two or More Communities, and the Standardisation of Death-rates,”
Jour. Roy. Stat. Soc., vol. lxxxvi., 1923, p. 399.

Correlation: Time-problem, Fitting of Trends, etc. (p. 208),
and Miscellaneous (p. 226).

(42) Harris, J. Arthur, “The Correlation between a Component, and
between the Sum of Two or More Components, and the Sum of the
Remaining Components of a Variable,” Quart. Pub. American Stat.
Assoc., vol. xv., 1917, p. 854.

(43) Yule, G. U., “On the Time-correlation Problem,” Jour. Roy. Stat. Soc.,
vol. lxxxiv., 1921, p. 497.

(44) Wicksell, S. D., “An Exact Formula for Spurious Correlation,”
Metron, vol. i., No. 4, 1921, p. 33.

(45) Pearson, Karl, and E. M. Elderton, “On the Variate Difference
Method,” Biometrika, vol. xiv., 1923, p. 281.

(46) Anderson, O., “Ueber ein neues Verfahren bei Anwendung der
‘Variate-Difference’ Methode,” Biometrika, vol. xv., 1923, p. 134.

(47) Yule, G. U., “Why do we sometimes get Nonsense Correlations
between Time-Series? A Study in Sampling and the Nature of
Time-Series,” Jour. Roy. Stat. Soc., vol. lxxxix., 1926, p. 1.

(48) Anderson, O., “Ueber die Anwendung der Differenzenmethode
(Variate Difference Method) bei Reihenausgleichungen,
Stabilitätsuntersuchungen, und Korrelationsmessungen,” Biometrika,
vol. xviii., 1926, p. 293.

(49) Gumbel, E. J., “Spurious Correlation and its Significance in
Physiology,” Jour. Amer. Stat. Assoc., vol. xxi., 1926, p. 179.





(50) Smith, B. B., “Combining the Advantages of First-difference and
Deviation-from-Trend Methods of Correlating Time Series,” Jour.
Amer. Stat. Assoc., vol. xxi., 1926, p. 55.

(51) Anderson, O., “On the Logic of the Decomposition of Statistical
Series into Separate Components,” Jour. Roy. Stat. Soc., vol. xc.,
1927, p. 548.

(52) Hotelling, H., “An Application of Analysis Situs to Statistics,” Bull.
Amer. Math. Soc., July-August 1927, p. 467.

(53) Isserlis, L., “Note on Chebysheff’s Interpolation Formula,”
Biometrika, vol. xix., 1927, p. 87. (Fitting polynomials.)

(54) Anderson, Oskar, Die Korrelationsrechnung in der Konjunkturforschung
(Frankfurter Gesellschaft für Konjunkturforschung), Kurt
Schroeder, Bonn, 1929.

(55) Darmois, G., “Analyse et comparaison des séries statistiques qui se
développent dans le temps,” Metron, vol. viii., Nos. 1-2, 1929, p. 211.

(56) Jordan, Charles, “Sur la détermination de la tendance séculaire des
grandeurs statistiques par la méthode des moindres carrés,” Jour. de
la Société Hongroise de Statistique, vol. vii., 1929, p. 567.

(57) Working, H., and H. Hotelling, “Applications of the Theory of
Error to the Interpretation of Trends,” Jour. Amer. Stat. Assoc.,
vol. xxiv., 1929, supplt. p. 73.

(58) Allan, F. E., “The General Form of the Orthogonal Polynomials for
Simple Series, with Proofs of their Simple Properties,” Proc. Roy.
Soc. Edin., vol. l., 1930, p. 310.

(59) Rhodes, E. C., “On the Fitting of Parabolic Curves to Statistical
Data,” Jour. Roy. Stat. Soc., vol. xciii., 1930, p. 569.

(60) Sipos, Alexander, “Practical Application of Jordan’s Method for
Trend Measurement,” Victor Hornyanszky Co., Ltd., Budapest,

(61) Will, Harry S., “On Fitting Curves to Observational Series by the
Method of Differences,” Ann. Math. Stats., vol. i., 1930, p. 159.

(62) Frisch, Ragnar, “A Method of Decomposing an Empirical Series into
its Cyclical and Progressive Components,” Jour. Amer. Stat. Assoc.,
vol. xxvi., 1931, supplt. p. 73.

(63) Macaulay, F. G., “Smoothing of Time Series,” New York, National
Bureau of Economic Research, 1931.

Partial Correlation and Partial Correlation Ratio (p. 252).

(64) Kelley, T. L., “Tables to facilitate the Calculation of Partial
Coefficients of Correlation and Regression Equations,” Bulletin of the
University of Texas, No. 27, 1916. (Tables giving the values of
l/V(l-rf,)(l-c4) and -r'i,) ) 

(65) Pearson, Karl, “On the Partial Correlation Ratio,” Proc. Roy. Soc.,
Series A, vol. xci., 1915, p. 492.

(66) Isserlis, L., “On the Partial Correlation Ratio; Part ii., Numerical,”
Biometrika, vol. xi., 1916-17, p. 50.

(67) Miner, J. R., Tables of $\sqrt{1-r^2}$ and $1-r^2$ for use in partial Correlation,
etc., The Johns Hopkins Press, Baltimore, 1922. (Six-figure tables.)

(68) Camp, Burton H., “Mutually Consistent Multiple Regression Surfaces,”
Biometrika, vol. xvii., 1926, p. 443.

(69) Kelley, T. L., and F. S. Salisbury, “An Iteration Method for
determining Multiple Correlation Constants,” Jour. Amer. Stat. Assoc.,
vol. xxi., 1926, p. 282.

(70) Ezekiel, Mordecai, “The Determination of Curvilinear Regression
Surfaces in the Presence of Other Variables,” Jour. Amer. Stat. Assoc.,
vol. xxi., 1926, p. 310.

(71) Hall, Philip, “Multiple and Partial Correlation Coefficients in the case
of an n-Fold Variate System,” Biometrika, vol. xix., 1927, p. 100.

(72) Tappan, M., “On Partial Multiple Correlation Coefficients in a Universe
of Manifold Characteristics,” Biometrika, vol. xix., 1927, p. 39.

(73) Tschuprow, A. A., transl. by L. Isserlis, “The Mathematical Theory
of the Statistical Methods employed in the Study of Correlation in the
case of Three Variables,” Trans. Camb. Phil. Soc., vol. xxiii., 1928,
p. 337.

(74) Ezekiel, M., “The Application of the Theory of Error to Multiple and
Curvilinear Correlation,” Jour. Amer. Stat. Assoc., vol. xxiv., 1929,
supplt. p. 99.

(75) Kelley, T. L., and Q. McNemar, “Doolittle versus the Kelley-Salisbury
Iteration Method, for Computing Multiple Regression Coefficients,”
Jour. Amer. Stat. Assoc., vol. xxiv., 1929, p. 164.

(76) Irwin, J. O., “Mathematical Theorems involved in the Analysis of
Variance,” Jour. Roy. Stat. Soc., vol. xciv., 1931, p. 284.
See also the book by Ezekiel, reference (299).

Sampling of Attributes (p. 273).

(77) Detlefsen, J. A., “Fluctuations of Sampling in a Mendelian Population,”
Genetics, vol. iii., 1918, p. 599.

(78) Rhodes, E. C., “On the Problem whether two given Samples can be
supposed to have been drawn from the same Population,” Biometrika,
vol. xvi., 1924, p. 239, and Metron, vol. v., 1925, p. 3.

(79) Pearson, Karl, “On the Difference and the Doublet Tests for
Ascertaining whether Two Samples have been drawn from the same
Population,” Biometrika, vol. xvi., 1924, p. 249.
See also under Binomial, Normal Curve, etc., below, and the
General References for Probable Errors on p. 397.

The Law of Small Chances (p. 273).

(80) Bortkiewicz, L. von, “Realismus und Formalismus in der
mathematischen Statistik,” Allgemein. Stat. Arch., vol. ix., 1916, p. 225.
(Continues the discussion initiated by the paper of Miss Whitaker,
cited on p. 273.)

(81) Greenwood, M., and G. Udny Yule, “On the Statistical Interpretation
of some Bacteriological Methods employed in Water Analysis,”
Journal of Hygiene, vol. xvi., 1917, p. 36. (Applies a criterion
developed from Poisson’s limit to the discrimination of water analyses;
numerous arithmetical examples.)

(82) “Student,” “An Explanation of Deviations from Poisson’s Law in
Practice,” Biometrika, vol. xii., 1919, p. 211.

(83) Bortkiewicz, L. von, “Ueber die Zeitfolge zufälliger Ereignisse,”
Bull. de l’Institut Int. de Stat., tome xx., 2e livr., 1915.

(84) Morant, G., “On Random Occurrences in Space and Time when
followed by a Closed Interval,” Biometrika, vol. xiii., 1921, p. 309.
See also references 114, 115.

Binomial, Normal Curve, and other Frequency Curves
(p. 314).

(85) Thiele, T. N., “The Theory of Observations,” Ann. Math. Stats.,
vol. ii., 1931, p. 165. (A complete reprint of a work now out of print
and inaccessible, issued in 1903.)





(86) Pearson, Karl, “Second Supplement to a Memoir on Skew Variation,”
Phil. Trans. Roy. Soc., Series A, vol. ccxvi., 1916, p. 429. (Completes
the description of type frequency curves contained in references (1)
and (3) of p. 105.)

The advanced student who desires to compare the merits of different
frequency systems proposed, should consult refs. (87) and (89).

(87) Charlier, C. V. L., Numerous papers issued from the Astronomical
Department of Lund, 1906-12, especially “Contributions to the
Mathematical Theory of Statistics” (1912).

(88) Dodd, E. L., “On Ordinary Plane and Skew Curves,” Bulletin of the
Univ. of Texas, No. 222, 1912.

(89) Edgeworth, F. Y., “On the Mathematical Representation of Statistical
Data,” Jour. Roy. Stat. Soc., vol. lxxix., 1916, p. 456; lxxx.,
pp. 65, 266, 411; lxxxi., 1918, p. 322.

(90) Soper, H. E., Frequency Arrays, Cambridge University Press, 1922.

(91) Camp, B. H., “Probability Integrals for the Point Binomial,”
Biometrika, vol. xvi., 1924, p. 163.

(92) Edgeworth, F. Y., “Untried Methods of Representing Frequency,”
Jour. Roy. Stat. Soc., vol. lxxxvii., 1924, p. 571.

(93) Romanovsky, V., “Generalisation of some Types of the Frequency
Curves of Professor Pearson,” Biometrika, vol. xvi., 1924, p. 106.

(94) Pearson, Karl, “Historical Note on the Origin of the Normal Curve
of Errors,” Biometrika, vol. xvi., 1924, p. 402.

(95) Camp, B. H., “Probability Integrals for a Hypergeometrical Series,”
Biometrika, vol. xvii., 1925, p. 61.

(96) Dodd, E. L., “The Frequency Laws of a Function of Variables with
given Frequency Laws,” Annals of Mathematics, vol. xxvii., 1925,
p. 12.

(97) Dodd, E. L., “The Frequency Law of a Function of One Variable,”
Bull. Amer. Math. Soc., vol. xxxi., 1925.

(98) Rhodes, E. C., “On the Generalised Law of Error,” Jour. Roy. Stat.
Soc., vol. lxxxviii., 1925, p. 576.

(99) Edgeworth, F. Y., “Mr Rhodes’s Curve and the Method of Adjustment,”
Jour. Roy. Stat. Soc., vol. lxxxix., 1926, p. 129.

(100) Charlier, C. V. L., “A New Form of the Frequency Function,”
Meddelande, Lunds Astronomiska Observatorium, 1928.

(101) Cramér, H., “On some Classes of Series used in Mathematical
Statistics,” Den sjette Skandinaviske Matematikerkongres, Copenhagen,
1928.

(102) Cramér, H., “On the Composition of Elementary Errors,”
Skandinavisk Aktuarietidskrift, 1928.

(103) Geary, R. C., “The Frequency Distribution of the Quotient of Two
Normal Variables,” Jour. Roy. Stat. Soc., vol. xciii., 1930, p. 442.

(104) Salvosa, L. R., “Tables of Pearson’s Type III Function,” Ann.
Math. Stats., vol. i., 1930, p. 191.

(105) Dodd, E. L., “Classification of Sizes and Measures by Frequency
Functions,” Jour. Amer. Stat. Assoc., vol. xxvi., 1931, p. 277.
(A survey: useful references.)

(106) Kondo, T., and E. M. Elderton, “Tables of the Functions of the
Normal Curve to Ten Decimal Places,” Biometrika, vol. xxii.,
1931, p. 368.

(107) Rietz, H. L., “On certain Properties of Frequency Distributions
obtained by a Linear Fractional Transformation of the Variates
of a given Distribution,” Ann. Math. Stats., vol. ii., 1931, p. 38.

(The above are concerned with the general theory of frequency
systems; the following deal with the forms which are suitable for
the representation of particular classes of data, e.g. statistics of
epidemic diseases, statistics of accidents, etc.)

(108) Brownlee, J., “The Mathematical Theory of Random Migration
and Epidemic Distribution,” Proc. Roy. Soc. Edin., vol. xxxi.,
1910-11, p. 262.

(109) Brownlee, J., “Certain Aspects of the Theory of Epidemiology in
Special Reference to Plague,” Proc. Roy. Soc. Medicine, Sect.
Epidemiology and State Medicine, vol. xi., 1918, p. 85. (The appendix
to this paper summarises the author’s results and those of Sir Ronald
Ross; vide infra.)

(110) Ross, Sir Ronald, “An Application of the Theory of Probabilities
to the Study of a priori Pathometry,” Proc. Roy. Soc., A, vol. xcii.,
1916, p. 204.

(111) Ross, Sir Ronald, and Hilda P. Hudson, “An Application of the
Theory of Probabilities to the Study of a priori Pathometry,” Pts. II.
and III., Proc. Roy. Soc., A, vol. xciii., 1917, pp. 212 and 225.

(112) Knibbs, G. H., “The Mathematical Theory of Population,” Appendix
A to vol. i. of Census of the Commonwealth of Australia. (Contains
a full discussion of the application of various frequency systems to
vital statistics.)

(113) Moir, H., “Mortality Graphs,” Trans. Actuarial Soc. America, vol.
xviii., 1917, p. 311. (Numerous graphs of mortality rates in different
classes and periods.)

(114) Greenwood, M., and G. U. Yule, “An Enquiry into the Nature of
Frequency Distributions representative of Multiple Happenings,
with particular reference to the Occurrence of Multiple Attacks of
Disease or of Repeated Accidents,” Jour. Roy. Stat. Soc., vol.
lxxxiii., 1920, p. 255.

(115) Newbold, Ethel M., “Practical Applications of the Statistics of
Repeated Events, particularly to Industrial Accidents,” Jour. Roy.
Stat. Soc., vol. xc., 1927, p. 487.

Goodness of Fit (p. 315 and p. 370).

(116) Pearson, Karl, “On a Brief Proof of the Fundamental Formula for
testing the Goodness of Fit of Frequency Distributions and on the
Probable Error of P,” Phil. Mag., vol. xxxi. (6th ser.), 1916, p. 369.

(117) Pearson, Karl, “Multiple Cases of Disease in the same House,”
Biometrika, vol. ix., 1913, p. 28. (A modification of the goodness-of-fit
test to cover such statistics as those indicated by the title.)

(118) Fisher, R. A., “On the Interpretation of $\chi^2$ from Contingency Tables,
and the Calculation of P,” Jour. Roy. Stat. Soc., vol. lxxxv., 1922,
p. 87.

(119) Yule, G. U., “On the Application of the $\chi^2$ Method to Association
and Contingency Tables, with experimental illustrations,” Jour. Roy.
Stat. Soc., vol. lxxxv., 1922, p. 95. After correspondence with Mr
Fisher I wish to withdraw the statement on p. 97 of this paper,
that a full proof [of the general theorem as applied to contingency
tables] seems still to be lacking: he has convinced me that his proof
covers the case.

The five following bear on the two preceding papers:

(120) Pearson, Karl, “On the $\chi^2$ Test of Goodness of Fit,” Biometrika,
vol. xiv., 1922, p. 186; and “Further Note,” ibid., p. 418.

(121) Bowley, A. L., and R. L. Connor, “Tests of Correspondence between
Statistical Grouping and Formulae,” Economica, 1923, p. 1.





(122) Fisher, R. A., “Statistical Tests of Agreement between Observation
and Hypothesis” (with a note in reply by A. L. Bowley), Economica,
1923, p. 139.

(123) Fisher, R. A., “The Conditions under which $\chi^2$ measures the
Discrepancy between Observation and Hypothesis,” Jour. Roy. Stat.
Soc., vol. lxxxvii., 1924, p. 442.

(124) Irwin, J. O., “Note on the $\chi^2$ Test for Goodness of Fit,” Jour. Roy.
Stat. Soc., vol. xcii., 1929, p. 264.

(125) Sheppard, W. F., “The Fit of a Formula for Discrepant Observations,”
Phil. Trans. Roy. Soc., A, vol. ccxxviii., 1929, p. 228.

(126) Neyman, J., and Egon S. Pearson, “Further Notes on the $\chi^2$
Distribution,” Biometrika, vol. xxii., 1931, p. 298.
See also references 32, 33, and 167.

Normal Correlation, and Other Correlation Surfaces
(p. 332).

(127) Pearson, Karl, and Others (editorial), “Tables for Determining the
Volumes of a Bi-variate Normal Surface,” Biometrika, vol. xxii.,
1930, p. 1.

(128) Pretorius, S. J., “Skew Bi-variate Frequency Surfaces, examined
in the Light of Numerical Illustrations,” Biometrika, vol. xxii.,
1930, p. 109.

Probable Errors, Sampling, etc.: General References
(p. 355).

(129) Tchebycheff, P. L. de, “Des valeurs moyennes,” Journal de
Mathématiques (2), vol. xii., 1867, pp. 177-84.

(130) Dodd, E. L., “The Probability of the Arithmetic Mean compared
with that of certain other Functions of the Measurements,” Annals
of Mathematics, vol. xiv., 1912-13.

(131) Isserlis, L., “On the Value of a Mean as calculated from a Sample,”
Jour. Roy. Stat. Soc., vol. lxxxi., 1918, p. 75.

(132) Soper, H. E., and Others, “On the Distribution of the Correlation
Coefficient in Small Samples,” Biometrika, vol. xi., 1916-17, p. 328.

(133) Pearson, Karl, “On the Probable Error of Biserial $\eta$,” Biometrika,
vol. xi., 1916-17, p. 292.

(134) Young, Andrew, and Karl Pearson, “On the Probable Error of a
Coefficient of Contingency without Approximation,” Biometrika,
vol. xi., 1916-17, p. 215.

(135) Pearson, Karl (editorial), “On the Probable Errors of Frequency
Constants,” Pt. III., Biometrika, vol. xiii., 1920, p. 113.

(136) “Student,” “An Experimental Determination of the Probable Error
of Dr Spearman’s Correlation Coefficients,” Biometrika, vol. xiii.,
1921, p. 263.

(137) Bispham, J. W., “An Experimental Determination of the Distribution
of the Partial Correlation Coefficient in Samples of Thirty,”
Proc. Roy. Soc., A, vol. xcvii., 1920, and Metron, vol. ii., 1923,
p. 684.

(138) Tschuprow, A. A., “On the Mathematical Expectation of the
Moments of Frequency Distributions,” Biometrika, vol. xii., 1918-19,
pp. 140 and 185, and vol. xiii., 1921, p. 283; and Metron,
vol. ii., 1923, pp. 461 and 646.

(139) Fisher, R. A., “On the Probable Error of a Coefficient of Correlation
deduced from a Small Sample,” Metron, vol. i., No. 4, 1921, p. 3.





(140) Fisher, R. A., “On the Mathematical Foundations of Theoretical
Statistics,” Phil. Trans., A, vol. ccxxii., 1922, p. 309.

(141) Camp, Burton H., “A New Generalisation of Tchebycheff’s Statistical
Inequality,” Bull. Amer. Math. Soc., vol. xxviii., 1922.

(142) Meidell, H. Birger, “Sur un problème du calcul des probabilités
et les statistiques mathématiques,” Comptes Rendus, vol. clxxv.,
1922, p. 806.

(143) Camp, Burton H., “Problems in Sampling,” Jour. Amer. Stat. Assoc.,
vol. xviii., 1923, p. 964.

(144) Dodd, E. L., “The Greatest and the Least Variate under General
Laws of Error,” Trans. Amer. Math. Soc., vol. xxv., 1923, p. 525.

(145) Meidell, H. Birger, “Sur la probabilité des erreurs,” Comptes
Rendus, vol. clxxvi., 1923, p. 280.

(146) Pearson, E. S., “The Probable Error of a Class-index Correlation,”
Biometrika, vol. xiv., 1923, p. 261.

(147) Fisher, R. A., “The Distribution of the Partial Correlation
Coefficient,” Metron, vol. iii., 1924, p. 329.

(148) Pearson, E. S., “Note on the Approximations to the Probable Error
of a Coefficient of Correlation,” Biometrika, vol. xvi., 1924, p. 196.

(149) Church, A. E. R., “On the Moments of the Distribution of Squared
Standard Deviations for Samples of N drawn from an indefinitely
large Population,” Biometrika, vol. xvii., 1925, p. 79.

(150) Fisher, R. A., “The Theory of Statistical Estimation,” Proc. Camb.
Phil. Soc., vol. xxii., 1925, p. 700.

(151) Hotelling, Harold, “The Distribution of Correlation Ratios
Calculated from Random Data,” Proc. Nat. Acad. Sci., vol. xi., 1925,
p. 657.

(152) Pearson, Karl, “Further Contributions to the Theory of Small
Samples,” Biometrika, vol. xvii., 1925, p. 176.

(153) Splawa-Neyman, J., “Contributions to the Theory of Small Samples
drawn from a Finite Population,” Biometrika, vol. xvii., 1925,
p. 472.

(154) Fisher, R. A., “Applications of ‘Student’s’ Distribution” (and
following Tables by “Student”), Metron, vol. v., No. 3, p. 90,
1925.

(155) Tschuprow, A. A., “On the Asymptotic Frequency Distributions of
the Arithmetic Means of n Correlated Observations for very great
Values of n,” Jour. Roy. Stat. Soc., vol. lxxxviii., 1925, p. 91.

(156) Dodd, E. L., “The Convergence of a General Mean of Measurements
to the True Value,” Bull. Amer. Math. Soc., vol. xxxii., 1926.

(157) Rhodes, E. C., “The Comparison of Two Sets of Observations,”
Jour. Roy. Stat. Soc., vol. lxxxix., 1926, p. 544.

(158) Church, A. E. R., “On the Means and Squared Standard Deviations
of Small Samples from any Population,” Biometrika, vol. xviii.,
1926, p. 321.

(159) Dodd, E. L., “The Convergence of General Means and the Invariance
of Form of certain Frequency Functions,” Amer. Jour. Math.,
vol. xlix., 1927.

(160) Greenwood, M., and L. Isserlis, “An Historical Note on the
Problem of Small Samples,” Jour. Roy. Stat. Soc., vol. xc., 1927,
p. 347.

(161) Hall, Philip, “The Distribution of Means for Samples of Size N
drawn from a Population in which the Variate takes Values between
0 and 1, all such Values being Equally Probable,” Biometrika, vol.
xix., 1927, p. 240.

(162) Irwin, J. O., “On the Frequency Distribution of the Means of
Samples from a Population having any Law of Frequency with
Finite Moments, etc.,” Biometrika, vol. xix., 1927, p. 225, and
vol. xxi., 1929, p. 431.

(163) Rhodes, E. C., “The Precision of Means and Standard Deviations
when the Individual Errors are Correlated,” Jour. Roy. Stat. Soc.,
vol. xc., 1927, p. 135.

(164) Fisher, R. A., “The General Sampling Distribution of the Multiple
Correlation Coefficient,” Proc. Roy. Soc., A, vol. cxxi., 1928, p. 654.

(165) Fisher, R. A., “Moments and Product Moments of Sampling
Distributions,” Proc. London Math. Soc., vol. xxx., 1928, p. 199.

(166) Fisher, R. A., and L. H. C. Tippett, “Limiting Forms of the
Frequency Distribution of the Largest or Smallest Member of a Sample,”
Proc. Camb. Phil. Soc., vol. xxiv., 1928, p. 180.

(167) Neyman, J., and E. S. Pearson, “On the Use and Interpretation of
Certain Test Criteria for Purposes of Statistical Inference,”
Biometrika, vol. xx. A, 1928 and 1929, p. 175 and p. 263.

(168) Wishart, John, “The Generalised Product Moment Distribution in
Samples from a Normal Multivariate Population,” Biometrika, vol.
xx. A, 1928, p. 32.

(169) Craig, C. C., “Sampling when the Parent Population is of Pearson’s
Type III,” Biometrika, vol. xxi., 1929, p. 287.

(170) Fisher, R. A., “Tests of Significance in Harmonic Analysis,” Proc.
Roy. Soc., A, vol. cxxv., 1929, p. 54.

(171) Holzinger, K. S., and A. E. R. Church, “On the Means of Samples
from a U-shaped Population,” Biometrika, vol. xx. A, 1929, p. 361.

(172) Irwin, J. O., “On the Frequency Distribution of any Number of
Deviates from the Mean of a Sample from a Normal Population and
the Partial Correlations between them,” Jour. Roy. Stat. Soc., vol.
xcii., 1929, p. 580.

(173) Kondo, T., “On the Standard Error of the Mean Square
Contingency,” Biometrika, vol. xxi., 1929, p. 376.

(174) Pearson, Egon S., and N. K. Adyanthaya, “The Distribution of
Frequency Constants in Small Samples from Non-normal Symmetrical
and Skew Populations. Second Paper: Distribution of
‘Student’s’ z,” Biometrika, vol. xxi., 1929, p. 259.

(175) Pearson, Egon S., “Some Notes on Sampling Tests with Two
Variables,” Biometrika, vol. xxi., 1929, p. 337.

(176) Pearson, Karl, G. B. Jeffery and E. M. Elderton, “On the
Distribution of the First Product-moment Coefficient in Small
Samples drawn from an Indefinitely Large Normal Population,”
Biometrika, vol. xxi., 1929, p. 164.

(177) Pepper, Joseph, “Studies in the Theory of Sampling,” Biometrika,
vol. xxi., 1929, p. 231. (The general theory of sampling from any
bi-variate population.)

(178) Rider, Paul R., “On the Distribution of the Ratio of Mean to
Standard Deviation in Small Samples from Non-normal Universes,”
Biometrika, vol. xxi., 1929, p. 124.

(179) Romanovsky, V., “On the Moments of Means of Functions of One
and More Random Variables,” Metron, vol. viii., Nos. 1 and 2,
1929, p. 251.

(180) Shohat, J. (Jacques Chokhate), “Inequalities for Moments of
Frequency Functions and for Various Statistical Constants,”
Biometrika, vol. xxi., 1929, p. 361.

(181) Soper, H. E., “The General Sampling Distribution of the Multiple
Correlation Coefficient,” Jour. Roy. Stat. Soc., vol. xcii., 1929, p. 445.

(182) Wishart, John, “The Correlation between Product Moments of any
Order in Samples from a Normal Population,” Proc. Roy. Soc. Edin.,
vol. xlix., 1929, p. 1.

(183) Woo, T. L., “Tables for ascertaining the Significance or
Non-significance of Association Measured by the Correlation Ratio,”
Biometrika, vol. xxi., 1929, p. 1.

(184) Baker, George A., “The Significance of the Product-moment
Coefficient, with special reference to the Marginal Distributions,”
Jour. Amer. Stat. Assoc., vol. xxv., 1930, p. 387, and the related
Paper: Pearson, Egon S., “The Test of the Significance for the
Correlation Coefficient,” Jour. Amer. Stat. Assoc., vol. xxvi., 1931,
p. 128.

(185) Baker, George A., “Distribution of the Means of Samples of n
drawn at random from a Population represented by a Gram-Charlier
Series,” Ann. Math. Stats., vol. i., 1930, p. 199, and note
by C. C. Craig, ibid., vol. ii., 1931, p. 99.

(186) Baker, George A., “Random Samples from Non-homogeneous
Populations,” Metron, vol. viii., No. 3, 1930, p. 67.

(187) Berkson, Joseph, “Bayes’ Theorem,” Ann. Math. Stats., vol. i.,
1930, p. 42.

(188) Ezekiel, Mordecai, “The Sampling Variability of Linear and
Curvilinear Regression,” Ann. Math. Stats., vol. i., 1930, p. 275.

(189) Fisher, R. A., “Inverse Probability,” Proc. Camb. Phil. Soc., vol.
xxvi., 1930, p. 528.

(190) Fisher, R. A., “The Moments of the Distribution for Normal Samples
of Measures of Departure from Normality,” Proc. Roy. Soc., A,
vol. cxxx., 1930, p. 16.

(191) Hotelling, H., “The Consistency and Ultimate Distribution of
Optimum Statistics,” Trans. Amer. Math. Soc., vol. xxxii., 1930,
p. 847.

(192) Irwin, J. O., “On the Frequency Distribution of the Means of
Samples from Populations of certain of Pearson’s Types,” Metron,
vol. vii., No. 4, 1930, p. 51.

(193) Kondo, T., “A Theory of the Sampling Distribution of Standard
Deviations,” Biometrika, vol. xxii., 1930, p. 36.

(194) Pearson, Egon S., “A Further Development of Tests for Normality,”
Biometrika, vol. xxii., 1930, p. 239.

(195) Pearson, Egon S., and J. Neyman, “On the Problem of Two
Samples,” Bull. de l’Acad. Polonaise des Sci. et des Lettres, Series A,
1930, p. 73.

(196) Smith, C. D., “On Generalised Tchebycheff Inequalities in
Mathematical Statistics,” Amer. Jour. Math., vol. lii., No. 1, 1930.

(197) Soper, H. E., “Sampling Moments of Moments of Samples of n
Units each drawn from an Unchanging Sampled Population, from
the Point of View of Semi-invariants,” Jour. Roy. Stat. Soc., vol.
xciii., 1930, p. 104.

(198) Wishart, J., “The Derivation of certain High-order Sampling
Product Moments from a Normal Population,” Biometrika, vol.
xxii., 1930, p. 224.

(199) Bortkiewicz, L. von, “The Relation between Stability and
Homogeneity,” Ann. Math. Stats., vol. ii., 1931, p. 1.

(200) Craig, C. C., “Sampling in the Case of Correlated Observations,”
Ann. Math. Stats., vol. ii., 1931, p. 324.

(201) Hotelling, H., “The Generalisation of ‘Student’s’ Ratio,” Ann.
Math. Stats., vol. ii., 1931, p. 360.

(202) McKay, A. T., “The Distribution of the Estimated Coefficient of
Variation,” Jour. Roy. Stat. Soc., vol. xciv., 1931, p. 564.





(203) Molina, E. C., “Bayes’ Theorem,” Ann. Math. Stats., vol. ii., 1931, p. 25.

(204) Pearson, Karl, and Brenda Stoessiger, “Tables of the Probability
Integrals of Symmetrical Frequency Curves in the Case of Low
Powers, such as arise in the Theory of Small Samples,” Biometrika,
vol. xxii., 1931, p. 253.

(205) Pearson, Karl, “On the Nature of the Relationship between Two
of ‘Student’s’ Variates ($z_1$ and $z_2$) when Samples are taken from a
Bi-variate Normal Population,” Biometrika, vol. xxii., 1931, p. 405.

(206) Rider, Paul R., “On Small Samples from certain Non-normal
Universes,” Ann. Math. Stats., vol. ii., 1931, p. 48.

(207) Wishart, J., “The Mean and Second-moment Coefficient of the
Multiple Correlation Coefficient in Samples from a Normal Population,”
Biometrika, vol. xxii., 1931, p. 353. (With an Editorial
appendix of tables of the mean value and squared standard deviation
of a multiple correlation coefficient.)

On the problem of fluctuations of sampling in correlations between
time-series, see also Yule (47).

General.

(208) Irwin, J. O., “Recent Advances in Mathematical Statistics,” Jour.
Roy. Stat. Soc., vol. xciv., 1931, p. 568. (A useful survey, with
references, of the work of 1930: a similar article promised on the
work of 1931.)

Tables of Functions, etc. (p. 358).

(209) Pearson, Karl, Tables of the Incomplete Gamma-Function; H.M.
Stationery Office, London, 1922. Price £2, 2s. 0d. net.

(210) Pearson, Karl (edited by), Tables for Statisticians and Biometricians,
Part II., 1931. To be obtained from the Secretary, Biometric
Laboratory, University College, London, England. Price 30s.,
post free. (Part I., now in its second edition, price 15s., is now
only to be had from the same address.)

(211) British Association Mathematical Tables, vol. i., London, 1931. Office
of the British Association, Burlington House, London, W. 1, price
10s., post free. (Circular and Hyperbolic Functions; Exponential
Sine and Cosine Integrals; Factorial (Gamma) and Derived Functions;
Integrals of Probability Integral. Many tables useful for
modern statistical work.)

Errors of Sampling in Agricultural Experiment.

A good deal of work has been done on this particular branch of
the subject, and the following references may be useful:

(212) Berry, R. A., and D. G. O’Brien, “Errors in Feeding Experiments
with Cross-bred Pigs,” Jour. Agr. Sci., vol. xi., 1921, p. 275.

(213) Harris, J. A., “On a Criterion of Substratum Homogeneity (or
Heterogeneity) in Field Experiments,” Amer. Naturalist, 1916,
p. 430.

(214) Hall, A. D., E. J. Russell, T. B. Wood, S. U. Pickering, S. H.
Collins, “The Interpretation of the Results of Agricultural Experiments,”
Journal of the Board of Agriculture, Supplement 7, 1911.
(Contains a collection of papers on error in field trials, feeding
experiments, horticultural work, milk-testing, etc.)




(215) Lyon, T. L., “Some Experiments to Estimate Errors in Field Plat
Tests,” Proc. Amer. Soc. of Agronomy, vol. iii., 1911, p. 89.

(216) Mercer, W. B., and A. D. Hall, “The Experimental Error of Field
Trials,” Jour. Agr. Sci., vol. iv., 1911, p. 107. (With an appendix
by “Student” describing the chessboard method of conducting
yield trials.)

(217) Mitchell, H. H., and H. S. Grindley, “The Element of Uncertainty
in the Interpretation of Feeding Experiments,” Univ. of Illinois
Agr. Expt. Station, Bull. 165, 1913.

(218) Robinson, G. W., and W. E. Lloyd, “On the Probable Error of
Sampling in Soil Surveys,” Jour. Agr. Sci., vol. vii., 1915, p. 144.

(219) Surface, F. M., and Raymond Pearl, “A Method of Correcting for
Soil Heterogeneity in Variety Tests,” Jour. Agr. Research, vol. v.,
1916, p. 1039.

(220) Wood, T. B., and R. A. Berry, “Variation in the Chemical Composition
of Mangels,” Jour. Agr. Sci., vol. i., 1905, p. 16.

(221) Wood, T. B., “The Feeding Value of Mangels,” Jour. Agr. Sci.,
vol. iii., 1910, p. 225.

(222) Wood, T. B., and F. J. M. Stratton, “The Interpretation of Experimental
Results,” Jour. Agr. Sci., vol. iii., 1910, p. 417.

(223) Beaven, E. S., “Trials of New Varieties of Cereals,” Jour. Min.
Agric. (England and Wales), vol. xxix., 1922, pp. 337 and 436.

(224) “Student,” “On Testing Varieties of Cereals,” Biometrika, vol. xv.,
1923, p. 271, and supplementary note, vol. xvi., 1924, p. 411.

(225) Hatton, R. G., N. H. Grubb, and R. C. Knight, “Black Currant
Trials,” Jour. of Pomology and Horticultural Science, vol. iv., 1925,
p. 2.

(226) Hayes, H. K., “Control of Soil Heterogeneity and Use of the Probable
Error Concept in Plant-Breeding Studies,” Univ. Minnesota Agric.
Expt. Stn., Tech. Bull. 30, 1925.

(227) Trought, Trevor, “A Statistical Note on the Cotton Variety Tests
at Sakha, 1916-20,” Min. Agric. Egypt, Tech. and Sci. Service, Bull.
51, 1925.

(228) Bailey, M. A., and T. Trought, “An Account of Experiments carried
out to Determine the Experimental Error of Field Trials with Cotton
in Egypt,” Min. Agric. Egypt, Tech. and Sci. Service, Bull. 63,
1926.

(229) Engledow, F. L., “A Census of an Acre of Corn,” Jour. Agr. Sci.,
vol. xvi., 1926, p. 166; and later papers of the series in vols.
xviii., xix., xx.

(230) Engledow, F. L., and G. U. Yule, “The Principles and Practice
of Yield-Trials,” Empire Cotton-Growing Corporation, Millbank
House, Millbank, London, S.W. 1, 1926; revised edition, 1930.
Price 2s. (Reprint from the Empire Cotton-Growing Review.)

(231) Fisher, R. A., “The Arrangement of Field Experiments,” Jour. Min.
Agric. (England and Wales), 1926.

(232) Lord, L., “The Preliminary Testing of Pure Line Selections of Rice,”
Tropical Agriculturist, vol. lxvii., 1926.

(233) “Student,” “Mathematics and Agronomy,” Jour. Amer. Soc. Agronomy,
vol. xviii., 1926.

(234) Eden, T., and R. A. Fisher, “The Experimental Determination of
the Value of Top-dressings with Cereals,” Jour. Agr. Sci., vol. xvii.,
1927, p. 548.

(235) Hubback, J. A., “Sampling for Rice Yield in Bihar and Orissa,”
Agric. Research Institute, Pusa, Bull. 166, 1927.





(236) Möller-Arnold, E., “Untersuchungen über Möglichkeiten der
Verminderung der Fehler von Feldversuchungen in der Praxis,”
Landw. Jahrb., vol. lxv., 1927, p. 943.

(237) Hayes, H. K., and F. R. Immer, “A Study of Probable Error Methods
in Field Experiments,” Sci. Agric., vol. viii., 1928, p. 345.

(238) Maskell, E. J., “Experimental Error,” Tropical Agriculture,
Trinidad, vol. v., 1928, p. 306, and vol. vi., 1929, pp. 5, 45, 97.

(239) Neyman, J., “The Theoretical Basis of Different Methods of Testing
Cereals. I. The Method of E. Zaleski.” (In English. Reprint
from the Journal Wiadomości Matematyczne, 1928.) Scientific
Publications of K. Buszczynski & Sons, Ltd., No. 1, Pedigree Seed
Cultures, Warsaw.

(240) Roemer, T., “Les essais comparatifs de rendements,” Bull. Assoc.
Int. Select. Plantes Grande Culture, vol. i., 1928, p. 158.

(241) Clapham, A. R., “The Estimation of Yield in Cereal Crops by
Sampling Methods,” Jour. Agr. Sci., vol. xix., 1929, p. 214.

(242) Möller-Arnold, E., Der Feldversuch in der Praxis, Julius Springer,
Berlin, 1929.

(243) Wishart, J., and A. R. Clapham, “A Study in Sampling Technique:
the Effect of Artificial Fertilisers on the Yield of Potatoes,” Jour.
Agr. Sci., vol. xix., 1929, p. 589.

(244) Fisher, R. A., and J. Wishart, “The Arrangement of Field Experiments
and the Statistical Reduction of the Results,” Technical
Communication No. 10 of the Imperial Bureau of Soil Science,
H.M. Stationery Office, London, 1930. (Price 1s. net.)

(245) Jorgensen, M., “Om Beregning af Usikkerheden paa Forsøgsresultater,”
Tidsskr. Planteavl., vol. xxxvi., 1930, p. 149.

(246) Kindermann, M., “Untersuchungen über die günstigste Grösse von
Versuchsteilstücken,” Landw. Jahrb., vol. lxxii., 1930, p. 141.

(247) Maskell, E. J., “Field Experiments on Sugar-cane,” Tropical
Agriculture, Trinidad, vol. vii., 1930, pp. 101, 125.

(248) Mitscherlich, E. A., “Die Beurteilung der Ergebnisse von Sorten-
und Stammanbauversuchen,” Z. Züchtung, 1930, p. 223.

(249) Richey, F. D., “Some Applications of Statistical Method to Agronomic
Experiments,” Jour. Amer. Stat. Assoc., vol. xxv., 1930,
p. 269.

(250) Roemer, T., “Der Feldversuch, eine kritische Studie,” Z. Züchtung,
1930, p. 483. (Dritte Auflage beim Bezuge durch die D.L.G.,
Berlin.)

(251) Sanders, H. G., “A Note on the Value of Uniformity Trials for
Subsequent Experiments,” Jour. Agr. Sci., vol. xx., 1930, p. 63.

(252) Behrens, W. U., “Zur Fehlerberechnung bei Feldversuchen nach
der Methode Knut Vik,” Pflanzenbau, vol. viii., 1931, p. 31.

(253) Christidis, B. G., “The Importance of the Shape of Plot in Field
Experimentation,” Jour. Agr. Sci., vol. xxi., 1931, p. 14.

(254) Clapham, A. R., and T. Wake Simpson, “Studies in Sampling
Technique: Cereal Experiments. I. Field Technique,” Jour. Agr.
Sci., vol. xxi., 1931, p. 366.

(255) Eden, T., “The Experimental Errors of Field Experiments with Tea,”
Jour. Agr. Sci., vol. xxi., 1931, p. 547.

(256) Hoblyn, T. L., “Field Experiments in Horticulture,” Technical
Communication No. 2, Imperial Bureau of Fruit Production,
1931.

(257) Papadakis, J., “Some Considerations on the Technique of Field
Experiments,” Bull. Assoc. Int. Select. Plantes Grande Culture,
vol. iv., 1931, p. 59.





(258) Tedin, O., “The Influence of Systematic Plot Arrangement upon
the Estimate of Error in Field Experiments,” Jour. Agr. Sci.,
vol. xxi., 1931, p. 191.

(259) Wishart, J., “The Analysis of Variance Illustrated in its Application
to a Complex Agricultural Experiment on Sugar-beet,” Archiv
für Pflanzenbau, Bd. 5, 1931, p. 561.

Applications of Statistical Method to Engineering Problems.

This is also a branch on which much work has been done of recent years,
but it is one with which I am so wholly unfamiliar that I cannot undertake
to give any detailed bibliography. The following books may be found
useful, and will give references:

(260) Becker, R., H. Plaut, und I. Runge, Anwendungen der mathematischen
Statistik auf Probleme der Massenfabrikation, Julius
Springer, Berlin, 1927. (Reprint 1930.)

(261) Fry, T. C., Probability and its Engineering Uses, London, Macmillan
& Co.; New York, D. van Nostrand & Co., 1928.

(262) Kohlweiler, Emil, Statistik im Dienste der Technik, R. Oldenbourg,
München und Berlin, 1931.

The “Reprints” of the Bell Telephone Laboratories Incorporated, New
York, include a number coming under the present head. Mention may be
made in particular of Reprint B-297 (reprinted from the Journal of the
Franklin Institute, vol. ccv., 1928): Economic Aspects of Engineering
Applications of Statistical Methods, by W. A. Shewhart, with a bibliography.

Works on Theory of Statistics, Probability, etc.

(App. II., p. 361.)

(263) Bachelier, L., Calcul des probabilités, tome i., Gauthier-Villars,
Paris, 1912.

(264) Bachelier, L., Le jeu, la chance, et le hasard, Flammarion, Paris, 1914.

(265) Bowley, A. L., Elements of Statistics, P. S. King, London, 5th ed.,
1926. (Part II., “Applications of Mathematics to Statistics,” can
be purchased separately.)

(266) Bowley, A. L., Elementary Manual of Statistics, Macdonald and
Evans, London, 4th ed., 1928. (A new edition of this elementary
work, to which reference is made in Appendix II., p. 360. Part II.,
dealing with different groups of official statistics, has been largely
rewritten.)

(267) Brown, W., and G. H. Thomson, The Essentials of Mental Measurement,
2nd ed., Cambridge University Press, 1921.

(268) Brunt, David, The Combination of Observations, Cambridge University
Press, 1917.

(269) Czuber, E., Die stat. Forschungsmethode, L. W. Seidel, Wien, 1921.

(270) Elderton, W. Palin, Frequency Curves and Correlation, 2nd ed.,
London, C. & E. Layton, 1927.

(271) Fisher, Arne, The Mathematical Theory of Probabilities and its
Application to Frequency Curves and Statistical Methods, vol. i., New
York (Macmillan), 1915. 2nd ed., enlarged, 1922.

(272) Forcher, Hugo, Die statistische Methode als selbständige Wissenschaft,
Leipzig, 1913 (Veit).

(273) Henry, A., Calculus and Probability for Actuarial Students, C. & E.
Layton, London, 1922.

(274) Jones, D. C., A First Course in Statistics, Bell & Sons, London, 1921.





(275) Julin, A., Principes de statistique théorique et appliquée: tome i.,
Statistique théorique, Paris (Rivière), Bruxelles (Dewit), 1921.

(276) Keynes, J. M., A Treatise on Probability, Macmillan, London, 1921.

(277) West, C. J., Introduction to Mathematical Statistics, Adams & Co.,
Columbus, 1918.

An inexpensive reprint of Laplace’s Essai philosophique (ref. 17 on p. 361)
has been published by Gauthier-Villars (Paris, 1921) in the series entitled
“Les maîtres de la pensée scientifique.”


During recent years interest in statistical method has been evidenced by
the issue of a rapidly increasing number of books on the subject. Of those
in the following list, the first five and (288) to (290) will all be found useful
as supplementing the present volume. Pearl’s work is specially intended
for those interested in vital statistics, but will be useful also to others.
Kelley’s book covers a great deal of ground not touched in the present
volume and, though more critical discussion of some of the methods seems
to me desirable, the student will find much that is not otherwise accessible
in volume form. In the very useful handbook edited by H. L. Rietz, each
chapter is written by a specialist; chapters on Interpolation, Curve Fitting,
and Periodogram Analysis, for example, all deal with matters not discussed
in this Introduction. R. A. Fisher’s Statistical Methods is a laboratory
handbook rather than a text-book, and brings together in convenient form
for the research worker the numerous special methods developed, mainly
by himself, with especial reference to small samples. Whittaker and
Robinson’s treatise is advanced and covers a wide field for statisticians and
others. The little book by the late Professor Tschuprow the student may
not find easy reading, but it deals with fundamentals. The small work
by Rietz will interest even the specialist. Darmois’ work is on completely
different lines from the present and is to be recommended to the student
of mathematical ability. The book by Westergaard and Nybølle is very
simply and practically written, with many examples; there are chapters
on Interpolation, Vital Statistics, and Insurance.

(278) Pearl, R., Introduction to Medical Biometry and Statistics, W. B.
Saunders Co., Philadelphia and London, 1923; 2nd ed. enlarged,
1930.

(279) Kelley, Truman L., Statistical Method, The Macmillan Co., New
York, 1923.

(280) Rietz, H. L. (edited by), Handbook of Mathematical Statistics,
Houghton Mifflin Co., Boston, 1924.

(281) Fisher, R. A., Statistical Methods for Research Workers, Oliver and
Boyd, Edinburgh and London, 3rd ed., 1930.

(282) Whittaker, E. T., and G. Robinson, The Calculus of Observations,
Blackie & Son, London, 1924.

(283) Tschuprow, A. A., Grundbegriffe und Grundprobleme der Korrelationstheorie,
Teubner, Leipzig, 1925.

(284) Niceforo, A., La Méthode Statistique, Marcel Giard, Paris, 1925.

(285) Secrist, H., An Introduction to Statistical Methods, revised edition,
The Macmillan Co., New York, 1925.

(286) Crum, L. W., and A. C. Patton, Economic Statistics, A. W. Shaw Co.,
Chicago and New York; A. W. Shaw & Co., Ltd., London, 1925.

(287) Day, Edmund E., Statistical Analysis, The Macmillan Co., New
York, 1925.

(288) Rietz, H. L., Mathematical Statistics, Open Court Publishing Co.,
Chicago, 1927. (A small work, one of a series intended for those
who have some mathematical knowledge but are not specialists.
Useful references.)

(289) Darmois, G., Statistique Mathématique, Paris, Librairie Octave Doin,
1928.

(290) Westergaard, H., and H. C. Nybølle, Grundzüge der Theorie der
Statistik, Fischer, Jena, 1928. (Nominally the 2nd ed. of Westergaard’s
work of 1890 (25, p. 361), but entirely rewritten.)

(291) Jordan, Charles, Statistique Mathématique, Gauthier-Villars, Paris,
1927.

(292) Burnside, W., Theory of Probability, Cambridge University Press,
1928.

(293) Chaddock, Robert E., Principles and Methods of Statistics, Houghton
Mifflin & Co., Boston, 1928.

(294) Mises, R. von, Wahrscheinlichkeit, Statistik und Wahrheit, Springer,
Berlin, 1928.

(295) Banister, H., Elementary Applications of Statistical Method, Blackie
& Son, Ltd., London and Glasgow, 1929. (A simple book for beginners,
based on experience with students of psychology.)


Vital Statistics.

The two following books on vital statistics are both revised editions,
Newsholme’s book having been completely rewritten.

(296) Newsholme, Sir Arthur, The Elements of Vital Statistics, revised
edition, Allen & Unwin, London, 1923.

(297) Whipple, G. C., Vital Statistics, 2nd ed., Wiley & Sons, New York;
Chapman & Hall, London, 1923.

The student of vital statistics who wishes to go on to modern methods
should get Pearl’s book (278).

Books, Recent.

The preceding lists give books published, or of which the first edition
was published, prior to the revision for press of the ninth edition (1929)
of this Introduction to the Theory of Statistics. The following have been
issued since that date:

(298) Kohn, Stanislav, Základy Teorie Statistické Metody (Elements of the
Theory of Statistical Method), published by the State Statistical
Office of the Czechoslovak Republic, Prague, 1929. (A solid work
of 483 pp.: detailed bibliographies.)

(299) Ezekiel, Mordecai, Methods of Correlation Analysis, John Wiley
& Sons, New York; Chapman & Hall, London, 1930. (Full
treatment of methods of computation, especially the methods that
have been developed by American writers for handling problems
with many variables.)

(300) Harper, F. H., Elements of Practical Statistics, Macmillan, New
York, 1930. (A manual for students not trained in mathematics.)

(301) March, Lucien, Les Principes de la Méthode Statistique, Félix Alcan,
Paris, 1930. (Comprehensive but elementary in treatment, and
very lucid in style, as one has learned to expect from the Honorary
Director of the Statistique Générale de la France: illustrations and
examples mainly from economic and demographic statistics.)

(302) Scarborough, J. B., Numerical Mathematical Analysis, Johns
Hopkins University Press, Baltimore; Milford, London, 1930.
(Covers the same sort of ground as Whittaker and Robinson,
ref. (282).)





(303) Montessus de Ballore, R. de, Probabilités et Statistiques, Hermann
& Cie, Paris, 1931. (Applications of the binomial series to the
fitting of frequency distributions.)

(304) Steffensen, J. F., Some Recent Researches in the Theory of Statistics
and Actuarial Science, Cambridge University Press, 1930. (The
substance of three lectures delivered in London.)

(305) Mises, R. von, Wahrscheinlichkeitsrechnung und die Anwendung in
der Statistik und theoretische Physik, Deuticke, Wien, 1931.

(306) Tippett, L. H. C., The Methods of Statistics, Williams & Norgate, Ltd.,
London, 1931. (Useful to the student already possessing some
knowledge who wants an introduction to the methods of R. A.
Fisher, analysis of variance, etc. Illustrations mainly biological.)

(307) Winkler, Wilhelm, Grundriss der Statistik, I. Theoretische Statistik,
Julius Springer, Berlin, 1931. (A section of the Enzyklopädie der
Rechts- und Staatswissenschaft: no knowledge of higher mathematics
assumed.)

(308) Woods, Hilda M., and W. T. Russell, An Introduction to Medical
Statistics, P. S. King & Son, Ltd., London, 1931. (An elementary
introduction, not only to the special methods of vital statistics,
but to statistical method in general.)




ANSWERS

TO, AND HINTS ON THE SOLUTION OF, THE EXERCISES GIVEN.


CHAPTER I.

1.  N        26,287        (AB)        887
    (A)       2,308        (AC)        374
    (B)       2,853        (BC)        353
    (C)         749        (ABC)       149

2.  (ABC)       156        (αBC)       179
    (ABγ)       431        (αBγ)     1,249
    (AβC)       272        (αβC)       163
    (Aβγ)       759        (αβγ)    20,504


3. The frequencies not given in the question itself are:

(a) (AB) 107. (AC) 406. (BC) 525.

(b) (Aβγ) 22,980. (αBγ) 13,585. (αβC) 96,478. (αβγ) 28,868,495.

W , {B) 

(A/3) {$) ** {AB) + {A^)^{B) + {^)' 


that IS 


(£S) U) 
{£) ^ N- 


, that IS 


, (^) 

(B)-{AB)^N-{A) 


that IS 


(a^) («)* 


5. (AB) + (BC) − (B), i.e. the sum of the excesses of (AB) and (BC) over (B)/2.
8. 160. Take A = husband exceeding wife in first measurement, B =
husband exceeding wife in second measurement, and find (αβ).


CHAPTER II.

1. 80/263 or 304 per thousand.

2. 56/85 or 65 per cent.

3. 32 per cent. and 30 per cent.

4. 117.
6. 108.

8. ½(1 − 2q), ½(1 + 2q); i.e., p must lie between 0 and ½(1 − 2q) or
between ½(1 + 2q) and 1.

9. As a hint, remember the condition that

(BC) ≮ (B) + (C) − N.



CHAPTER III. 

1. Deaf-mutes from childhood per million: among males 222; among
females 183; there is therefore positive association between deaf-mutism and
male sex: if there had been no association between deaf-mutism and sex, there
would have been 3176 male and 3393 female deaf-mutes.

2. (a) positive association, since (AB)₀ = 1457.

(b) negative association, since 294/490 = 3/5, 380/570 = 2/3.

(c) independence, since 256/768 = 1/3, 48/144 = 1/3.
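
The comparisons in 2 (b) and 2 (c) amount to reducing two ratios to their lowest terms and seeing which is the larger. A minimal sketch in Python follows; the helper name `compare` is ours, and which class-frequencies the four counts stand for is not restated here, since the exercise itself is not reproduced in these answers.

```python
from fractions import Fraction


def compare(first, second):
    """Classify a pair of proportions; equality corresponds to independence."""
    if first == second:
        return "independence"
    return "first smaller" if first < second else "first larger"


case_b = (Fraction(294, 490), Fraction(380, 570))  # reduce to 3/5 and 2/3
case_c = (Fraction(256, 768), Fraction(48, 144))   # both reduce to 1/3

print(case_b, compare(*case_b))
print(case_c, compare(*case_c))
```

Exact fractions are used so that the equality in (c) is not at the mercy of floating-point rounding.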

3. Percentage of Plants above the Average Height.

   Parentage              Crossed          Self-fertilised
   Ipomœa purpurea        86 per cent.     25 per cent.
   Petunia violacea       79    ,,         17    ,,
   Reseda lutea           78    ,,         34    ,,
   Reseda odorata         71    ,,         45    ,,
   Lobelia fulgens        60    ,,         35    ,,

The association is much less for the species at the end than for those at the
beginning of the list.

4. Percentage of dark-eyed amongst the sons of dark-eyed fathers, 39 per
cent.

Percentage of dark-eyed amongst the sons of not dark-eyed fathers, 10 per
cent.

If there had been no heredity, the frequencies to the nearest unit would
have been (AB)₀ 18, (Aβ)₀ 111, (αB)₀ 121, (αβ)₀ 760.

5. Percentage of light-eyed amongst the wives of light-eyed husbands, 59
per cent.

Percentage of light-eyed amongst the wives of not light-eyed husbands, 53
per cent.

If there had been no association: (AB)₀ = 298, (Aβ)₀ = 225, (αB)₀ = 143,
(αβ)₀ = 108.

6. The following are the proportions of the insane per thousand in
successive age groups:

In general population: 0.9, 2.3, 4.1, 5.7, 6.9, 7.6, 7.7, 6.8.
Amongst the blind: 20.1, 16.0, 16.3, 20.7, 18.3, 17.8, 11.4, 5.3.

Note the diminishing association, which is especially clear in the age-group
65-, and the negative association in the last age-group. The association
coefficient gives the values below, which decrease continuously:

Association coefficient: +0.92, +0.76, +0.61, +0.57, +0.46, +0.41,
+0.20, −0.13.
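
The association coefficients can be roughly reproduced from the two rows of rates. The sketch below is an illustration under stated assumptions, not the author's working: it takes the coefficient to be Q = (ad − bc)/(ad + bc), treats the general-population rate as if it were the rate among the not-blind (a close approximation, the blind being few), and reads the fourth general-population rate as 5.7. The function name `yule_q` is ours.

```python
def yule_q(rate_in_group, rate_elsewhere, per=1000.0):
    """Q = (ad - bc)/(ad + bc), written in terms of the odds ratio ad/bc."""
    odds_group = rate_in_group / (per - rate_in_group)
    odds_elsewhere = rate_elsewhere / (per - rate_elsewhere)
    ratio = odds_group / odds_elsewhere
    return (ratio - 1.0) / (ratio + 1.0)


general = [0.9, 2.3, 4.1, 5.7, 6.9, 7.6, 7.7, 6.8]        # insane per thousand
blind = [20.1, 16.0, 16.3, 20.7, 18.3, 17.8, 11.4, 5.3]   # insane per thousand

q_values = [yule_q(b, g) for b, g in zip(blind, general)]
print([round(q, 2) for q in q_values])
```

Because the inputs are rates rounded to one decimal, the results agree with the printed coefficients only to about a unit in the second decimal place.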

CHAPTER IV. 

1. (7?)/iY = 6*9 per cent == 6*8 per cent. 

(Am^) =46 0 „ -44 6 „ 

mm = 3*6 (A^)m = 4*7 „ 

(AmKAfi) =41 2 „ {Amum = 54*9 „ 

=42 7 „ {AB)I{B) =29*2 „ 

{ABD)j{AB)^^l ^ „ {ABJ))I{BI))^Z5 Z „ 

The above give two legitimate comparisons. The general results are the same
as for the boys, i.e. a very small association between development-defects and
dulness amongst those exhibiting nerve-signs, as compared with those who do
not exhibit nerve-signs, or with the girls in general. As the association
amongst those who do not exhibit nerve-signs is quite as high as for the girls
in general, the “conclusion” quoted does not seem valid.


2.
                  (1)         (2)                         (1)         (2)
                  per         per                         per         per
                thousand.   thousand.                   thousand.   thousand.

(B)/N             3.2         7.5       (A)/N             0.9         4.0
(AB)/(A)         14.9        11.7       (AB)/(B)          4.0         6.3
(BC)/(C)         38.8        63.0       (AC)/(C)          6.6        18.8
(ABC)/(AC)       216         214        (ABC)/(BC)       36.8        63.8


The above give the two simplest comparisons, either of which is sufficient to
show that there is a high association between blindness and mental derangement
amongst the deaf-mutes as well as in the general population; amongst
the old, the association is, in fact, small for the general population, but
well-marked for deaf-mutes. This result stands in direct contrast with that of
Qu. 1, where the association between the two defects A and D was much
smaller in the defective universe β than in the universe at large. As previously
stated, no great reliance can be placed on the census data as to these infirmities.

3. If the cancer death rates for farmers over 45 and under 45 respectively
were the same as for the population at large, the rate for all farmers 16-
would be 1.11. This is slightly less than the actual rate 1.20, but the excess
would not justify the statement that farmers were “peculiarly liable to cancer.”
It is, in point of fact, due to the further differences of age distribution that we
have neglected, e.g. amongst those over 45 there are more over 55 amongst
farmers than amongst the general population, and so on.

4. 15 per cent 

6. If A and B were independent in both C and γ universes, we would have
(AB) equal to

(471 × 419)/617 + (151 × 139)/383 = 374.7 (approximately).

Actually (AB) only = 358. Therefore A and B must be disassociated in one or
both partial universes.
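
The arithmetic of this answer can be checked directly. In the sketch below each term is taken to be of the form (A)(B)/N within one partial universe; the assignment of the six printed numbers to particular class-frequencies is our reading of the garbled expression, and the function name is hypothetical.

```python
def expected_if_independent(count_a, count_b, universe_size):
    """Frequency of AB expected under independence: (A)(B)/N."""
    return count_a * count_b / universe_size


expected_in_c = expected_if_independent(471, 419, 617)      # within the C universe
expected_in_gamma = expected_if_independent(151, 139, 383)  # within the gamma universe
expected_total = expected_in_c + expected_in_gamma

print(round(expected_in_c, 2), round(expected_in_gamma, 2), round(expected_total, 2))
```

The observed (AB) quoted in the answer falls short of this total, which is the ground for inferring disassociation in at least one of the two partial universes.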

9. (1) 68.1 per cent. (2) 42.5 per cent. The fallacy discussed in § 2 is
now avoided, and there seems no reason for declining to consider this as evidence
of the effect of expenditure on election results.

10 The limits to y are— 

subject to the conditions y ≯ x, y ≮ 0. No inference of a positive
association from two negatives is possible unless x lies between the limits
·382 . . . , ·618 . . .

11. The limits to y are — 

( 1 ) 

6a:®), 

subject to conditions y<tO, <t4a:~ 1, ^a;. 

An inference is only possible from positive associations of .^Aaud AG x'^ 
i , an inference is only possible from two negative associations if a: lies between 
211 . . , and *274. . . Note that a? cannot exceed J 

(2) 2^<^(6a:-3a52-l) 

>|(2a?-{-3aj2), 

subject to conditions y<t0, <t5a;- 1, IJ>a:. 





No inference is possible from positive associations of AB and BO 
An inference is only possible from negative associations if x he between 
*183 , . . .and 215 .... Note that a? cannot exceed 

(3) y<i(ex- 2x^-1) 

>i(3x-h2x^), 

subject to the conditions 2 /<t:; 0 , <t5a;- 1, 

As in (2), no inference is possible from positive associations oi AG and BO , 
an inference is possible from negative associations if x he between 177 . . 
and 224 .... Note that x cannot exceed 


CHAPTER V. 

1. A, 0.68. B, 0.36.


CHAPTER VI.

1. 1200, 200. 2. 100, 20. 3. 146.25. 4. 216.6.


CHAPTER VII. 

2. Mean, 156.73 lb. Median, 154.67 lb. Mode (approx.), 150.6 lb. (Note
that the mean and the median should be taken to a place of decimals further
than is desired for the mode: the true mode, found by fitting a theoretical
frequency curve, is 151.1 lb.)

3. Mean, 0.6330. Median, 0.6391. Mode (approx.), 0.651. (True mode
is 0.653.)
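
The approximate modes in answers 2 and 3 are consistent with the empirical relation Mode = Mean − 3(Mean − Median). The sketch below assumes that this is the rule intended, and reads the mean of answer 2 as 156.73 lb.; the function name is ours.

```python
def approximate_mode(mean, median):
    """Empirical relation for moderately skew distributions: mode = mean - 3(mean - median)."""
    return mean - 3.0 * (mean - median)


mode_weights = approximate_mode(156.73, 154.67)   # answer 2, in lb.
mode_second = approximate_mode(0.6330, 0.6391)    # answer 3

print(round(mode_weights, 2), round(mode_second, 4))
```

The note in answer 2 about carrying an extra decimal in the mean and median follows from the factor 3: any rounding error in their difference is tripled in the mode.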

4. £35.5 approximately.

5. (1) 116.0. (2) Means 77.4, 89.0, ratio 114.9. (3) Geometrical means 77.2,
88.9, ratio 115.2. (4) 115.2.

6. (1) 921,607. (2) 916,963.

7. 1st qiial 10s. 6jd 2nd qual 93, 2jd. 

8. np. If the terms of the given binomial series are multiplied by 0, 1, 2, 3
. . . , note that the resulting series is also a binomial when a common factor
is removed. [The full proof is given in Chapter XV. § 6.]


CHAPTER VIII. 

2. Standard deviation 21.3 lb. Mean deviation 16.4 lb. Lower quartile
142.5, upper quartile 168.4; whence Q = 12.95. Ratios: m.d./s.d. = 0.77,
Q/s.d. = 0.61. Skewness, 0.29.
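
The derived figures in answer 2 follow from the constants by simple arithmetic. The sketch below is an illustration only: it takes the skewness to be (mean − mode)/standard deviation, and borrows the mean (read as 156.73 lb.) and approximate mode (150.6 lb.) from the answer to Chapter VII., question 2.

```python
standard_deviation = 21.3
mean_deviation = 16.4
lower_quartile, upper_quartile = 142.5, 168.4
mean, approximate_mode = 156.73, 150.6   # taken from Chapter VII., answer 2

semi_interquartile_range = (upper_quartile - lower_quartile) / 2.0
ratio_md_sd = mean_deviation / standard_deviation
ratio_q_sd = semi_interquartile_range / standard_deviation
skewness = (mean - approximate_mode) / standard_deviation

print(round(semi_interquartile_range, 2), round(ratio_md_sd, 2),
      round(ratio_q_sd, 2), round(skewness, 2))
```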

3. Approximately: lower quartile = £26.1, upper quartile = £54.6, ninth
decile = £94.

5. (1) M = 73.2, σ = 17.3. (2) M = 73.2, σ = 17.5. (3) M = 73.2, σ = 18.0.

(Note that while the mean is unaffected in the second place of decimals, the
standard deviation is the higher the coarser the grouping.)

6. √(npq). The proof is given in Chapter XV. § 6.

7. The assumption that observations are evenly distributed over the
intervals does not affect the sum of deviations, except for the interval in which
the mean or median lies: for that interval the sum is n₂(0.25 + d²): hence the
entire correction is

d(n₁ − n₃) + n₂(0.25 + d²).

In this expression d is, of course, expressed as a fraction of the class-interval, 
and is given its proper sign, Notice that the % and % of this question are 
not the same as the iVj and 1®* 


CHAPTER IX. 

1. σx = 1.414, σy = 2.280, r = +0.81. X = 0.5Y + 0.5. Y = 1.3X + 1.1.
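
The two regression equations can be checked against the constants. The sketch below reads the line above as giving standard deviations 1.414 and 2.280 with r = +0.81, and regressions X = 0.5Y + 0.5 and Y = 1.3X + 1.1; that reading of the damaged print is an assumption, and the function name is ours. The slopes are r·σx/σy and r·σy/σx, and since both lines pass through the means, their intersection recovers the means.

```python
def regression_coefficients(r, sd_x, sd_y):
    """Return (b_xy, b_yx): the slope of X on Y and the slope of Y on X."""
    return r * sd_x / sd_y, r * sd_y / sd_x


b_xy, b_yx = regression_coefficients(0.81, 1.414, 2.280)
print(round(b_xy, 1), round(b_yx, 1))

# Substitute Y = 1.3X + 1.1 into X = 0.5Y + 0.5 and solve for X.
mean_x = (0.5 * 1.1 + 0.5) / (1 - 0.5 * 1.3)
mean_y = 1.3 * mean_x + 1.1
print(round(mean_x, 2), round(mean_y, 2))
```

The product of the two slopes, about 0.65, is close to r² = 0.656, which is a further check on the reading.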

2 Using the subscripts 1 for earnings, 2 for pauperism, 3 for out relief ratio, 

-3^3=5*79, 0-3=3 09 : ris= —0 13, +0 60. 


CHAPTER XI. 

1. 1.232 per cent. (against 1.240 per cent.): 2.556 in. against 2.572 in.
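
The second pair of figures is consistent with Sheppard's correction for grouping, in which one-twelfth of the squared class-interval is subtracted from the squared standard deviation. The sketch below assumes a class-interval of one inch and a rough value of 2.572 in.; both are our reading, not stated in the answer, and the function name is hypothetical.

```python
import math


def sheppard_corrected_sd(rough_sd, class_interval=1.0):
    """Subtract h^2/12 from the variance of the grouped data, then take the root."""
    return math.sqrt(rough_sd ** 2 - class_interval ** 2 / 12.0)


corrected = sheppard_corrected_sd(2.572)
print(round(corrected, 3))
```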

2. The corrected standard-deviation is 0.9954 of the rough value.

3. Estimated true standard-deviation 6.91: standard-deviation of fluctuations
of sampling 9.38. (The latter, which can be independently calculated,
is too low, and the former consequently probably too high. Cf. Chap. XIV.
§ 10.)

4. 0.43.

6. 58 per cent.

7 

8. 0*30 

~ + 

The otheis may be written down from symmetry. 

10 (1) No effect at all (2) If the mean value of the errors in variables is 
d, and in the weights the value found for the weighted mean is — 

The true value -P - r o-* — :• 

'i4?(w;-Pe) 

If r is small, d is the important term, and hence errors in the quantities are
usually of more importance than errors in the weights. If r become considerable,
errors in the weights may be of consequence, but it does not seem probable
that the second term would become the most important in practical cases.

11 ^ = 2/3 

12 ^ = 0 77. 

CHAPTER XII. 

1. r12.3 = +0.759, r13.2 = +0.097, r23.1 = −0.436.

σ1.23 = 2.64, σ2.13 = 0.594, σ3.12 = 70.1.

X1 = 9.31 + 3.37 X2 + 0.00364 X3.





2. r12.34 = +0.680, r13.24 = +0.803, r14.23 = +0.397.

r23.14 = −0.433, r24.13 = −0.553, r34.12 = −0.149.

σ1.234 = 9.17, σ2.134 = 49.2, σ3.124 = 12.6, σ4.123 = 105.4.

X1 = 63 + 0.127 X2 + 0.587 X3 + 0.0345 X4.

3. The correlation of the pth order is r/(1 + pr). Hence if r be negative, the
correlation of order n − 2 cannot be numerically greater than unity and r
cannot exceed (numerically) 1/(n − 1).
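
The result r/(1 + pr) can be verified by repeated use of the usual formula for a partial correlation. When every correlation of a given order equals r, eliminating one more variable gives (r − r²)/(1 − r²), which simplifies to r/(1 + r). The sketch below, with function names of our own choosing, applies that step p times in exact arithmetic.

```python
from fractions import Fraction


def next_order(r):
    """Partial correlation after one more variable is eliminated, all correlations being r."""
    return (r - r * r) / (1 - r * r)


def partial_of_order(r, p):
    """Apply the elimination step p times, starting from the total correlation r."""
    for _ in range(p):
        r = next_order(r)
    return r


r = Fraction(1, 2)
print(partial_of_order(r, 3), r / (1 + 3 * r))

# Limiting case for n = 5 variables: r = -1/(n - 1) gives -1 at order n - 2.
print(partial_of_order(Fraction(-1, 4), 3))
```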

4 

^ ^X2*S= - ^13 S“^23 1 — 

6 1*12 3 = ^18 a“^23*l= “ 

CHAPTER XIII.

1. Theo. M = 6, σ = 1.732: Actual M = 6.116, σ = 1.732.

2. (a) Theo. M = 2.5, σ = 1.118: Actual M = 2.48, σ = 1.14.

(b)   ,,   M = 3, σ = 1.225:    ,,   M = 2.97, σ = 1.26.

(c)   ,,   M = 3.5, σ = 1.323:  ,,   M = 3.47, σ = 1.40.

3. Theo. M = 50, σ = 5: Actual M = 50.11, σ = 5.23.
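
The theoretical values in answers 1 to 3 agree with a binomial in which the chance of success is one-half, the mean being np and the standard deviation √(npq). The numbers of events per trial used below (12, 5, 6, 7 and 100) are inferred from the printed means and are an assumption, as the exercises themselves are not reproduced here.

```python
import math


def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(npq) of the number of successes."""
    return n * p, math.sqrt(n * p * (1.0 - p))


for n in (12, 5, 6, 7, 100):
    mean, sd = binomial_mean_sd(n, 0.5)
    print(n, mean, round(sd, 3))
```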

4. The standard deviation of the proportion is 0.00179, and the actual
divergence is 5.4 times this, and therefore almost certainly significant.

5. The standard deviation of the number drawn is 32, and the actual
difference from expectation 18. There is no significance.

6. p = 1 − σ²/M, n = M/p. p = 0.510, n = 12.0: p = 0.454, n = 110.4.
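
The formulae of answer 6 follow from M = np and σ² = npq, whence q = σ²/M. The sketch below applies them to the actual means and standard deviations quoted in answers 1 and 3 (read here as 6.116, 1.732 and 50.11, 5.23); the function name is ours.

```python
def binomial_from_moments(mean, sd):
    """Solve M = np, sd^2 = npq for p and n."""
    p = 1.0 - sd ** 2 / mean
    n = mean / p
    return p, n


p_first, n_first = binomial_from_moments(6.116, 1.732)
p_second, n_second = binomial_from_moments(50.11, 5.23)

print(round(p_first, 3), round(n_first, 1))
print(round(p_second, 3), round(n_second, 1))
```

The second value of n comes out at about 110.3 rather than 110.4, a difference attributable to the standard deviation having been rounded to two decimals before use.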

8. Standard deviation of simple sampling 23.0 per cent. The actual
standard-deviation does not, therefore, seem to indicate any real variation, but
only fluctuations of sampling.

9. Difference from expectation 7.6: standard error 10.0. The difference
might therefore occur frequently as a fluctuation of sampling.

10. The test can be applied either by the formulae of Case II. or Case III.:
Case II. is taken as the simplest.

(a) (AB)/(B) = 69.1 per cent.: (Aβ)/(β) = 80.0 per cent. Difference 10.9
per cent. (A)/N = 71.1 per cent., and thence ε12 = 12.9 per cent. The actual
difference is less than this, and would frequently occur as a fluctuation of
simple sampling.

(b) (AB)/(B) = 70.1 per cent.: (Aβ)/(β) = 64.3 per cent. Difference 5.8 per
cent. (A)/N = 67.6 per cent., and thence ε12 = 3.40 per cent. The actual
difference is 1.7 times this, and might, rather infrequently, occur as a fluctuation
of simple sampling.


CHAPTER XIV. 


   Row.      σp.        Group of Rows.         σp.

    1        3.1        5, 6, and 7            2.1
    2        2.1        8, 9, 10, and 11       1.6
    3        1.7        12, 13, and 14         1.2
    4        2.7        15 and upwards         1.1

σp is given in units per 1000 births, as s and s₀.

2. Sa = 7 02, and (rp=^2 b units. 

B <j^=^n pq&s if the chance of success weie p m all cases (but the mean la 
n/2 mtp 7i). 

4 Mean number of deaths per annum = = 680, 

0^=666,682. r=0*000029. 





CHAPTER XV.

1. (1)   Successes.   Frequency.
            0             1
            1            12
            2            66
            3           220
            4           495
            5           792
            6           924
            7           792
            8           495
            9           220
           10            66
           11            12
           12             1
         Total,        4096

   (2)   Successes.   Frequency.
            0           459.4
            1          1102.6
            2          1212.8
            3           808.6
            4           363.9
            5           116.4
            6            27.2
            7             4.7
            8             0.6
         Total,        4096.2

   (3)   Successes.   Frequency.
            0           192
            1           288
            2           144
            3            24
         Total,         648
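
The three series of answer 1 can be regenerated from the binomial expansion N(q + p)ⁿ. The parameters below are inferred from the printed frequencies and are an assumption: (1) N = 4096, n = 12, p = 1/2; (2) N = 4096, n = 12, p = 1/6; (3) N = 648, n = 3, p = 1/3. The function name is ours.

```python
from fractions import Fraction
from math import comb


def binomial_frequencies(total, n, p):
    """Terms of total * (q + p)^n, indexed by the number of successes."""
    q = 1 - p
    return [total * comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1)]


coins = binomial_frequencies(4096, 12, Fraction(1, 2))
dice = binomial_frequencies(4096, 12, Fraction(1, 6))
thirds = binomial_frequencies(648, 3, Fraction(1, 3))

print([int(f) for f in coins])
print([round(float(f), 1) for f in dice[:9]])
print([int(f) for f in thirds])
```

Series (1) and (3) are reproduced exactly; the terms of series (2) agree with the printed values to within a unit in the first decimal place, consistent with the printed figures having been built up by successive multiplication of rounded terms.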


2. The frequency of r successes is greater than that of r − 1 so long as
r < np + p: if np is an integer, r = np gives the greatest term and also the mean.
3. This follows at once from a consideration of the Galton-Pearson apparatus.


4.    Binomial.     Normal curve.

          1              1.7
         10             10.5
         45             42.7
        120            116.1
        210            211.5
        252            258.4
        210            211.5
        etc.            etc.
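
The normal-curve column can be reproduced by taking ordinates of the normal curve with the same total, mean and standard deviation as the binomial. The sketch below assumes the binomial is (1/2 + 1/2)¹⁰ with total 1024, so that the mean is 5 and the standard deviation √2.5; those parameters are inferred from the binomial column.

```python
import math
from math import comb

n = 10
total = 2 ** n               # 1024, the sum of the binomial coefficients
sd = math.sqrt(n * 0.25)     # sqrt(npq) with p = q = 1/2


def normal_ordinate(deviation):
    """Ordinate of the fitted normal curve at a given deviation from the mean."""
    return total / (sd * math.sqrt(2.0 * math.pi)) * math.exp(-deviation ** 2 / (2.0 * sd ** 2))


binomial = [comb(n, r) for r in range(6)]
normal = [round(normal_ordinate(r - n / 2), 1) for r in range(6)]
print(list(zip(binomial, normal)))
```

The second ordinate evaluates to about 10.53.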


6 The data are i/=68 855, o'=2 56, 8. 

6. (1) United Kingdom: direct 1.75, from standard-deviation 1.73.

(2) Cambridge students: direct 1.68, from standard-deviation 1.73.

7. 70.6 per cent. 8. 27 per cent.

9. (1) In a 12.4 per cent., b 10 per cent. of the trials, assuming normality,
but the assumption is hardly quite valid. (2) a about 13 times in 100,000
trials; b practically impossible, being a deviation of over 7 times the standard
error.

10. 853. 11. Mean 74.3, standard-deviation 3.23.


CHAPTER XVI. 

3. From equations (10) and (11) replace σ₁ and σ₂ by Σ₁ and Σ₂ in equation
(9). Regarding this as an equation for r, note that r is a maximum when
tan 2θ is infinite, or θ = 45°.





4. In fig. 50, suppose every horizontal array to be given a slide to the right
until its mean lies on the vertical axis through the mean of the whole distribution:
then suppose the ellipses to be squeezed in the direction of this vertical
axis until they become circles. The original quadrant has now become a
sector with an angle between one and two right angles, and the question is
solved on determining its magnitude.


CHAPTER XVII. 

1. Estimated frequency 1554, standard error 0.28 lb. 2. Lower Q,
frequency 1472, standard error 0.26 lb.; upper Q, frequency 1116, standard
error 0.34 lb. 3. 0.18 lb. 4. 0.24 lb., 17 per cent. less than the standard
error of the median. 5. 0.0196 in. or 0.76 per cent. of the standard-deviation:
the standard error of the semi-interquartile range is 1.23 per cent. of that
range.


     r.       n = 100.    n = 1000.
    0.0        0.1         0.0316
    0.2        0.096       0.0304
    0.4        0.084       0.0266
    0.6        0.064       0.0202
    0.8        0.036       0.0114
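
The tabulated values are consistent with the usual expression (1 - r²)/√n for the standard error of a correlation coefficient. The following minimal sketch, which assumes that expression (the function name is hypothetical), rebuilds the two columns so that they can be compared with the table.

```python
from math import sqrt


def standard_error_of_r(r, n):
    """Standard error (1 - r^2) / sqrt(n) of a correlation coefficient r from n pairs."""
    return (1.0 - r * r) / sqrt(n)


# Rebuild the two columns of the table for n = 100 and n = 1000.
rows = [(r, standard_error_of_r(r, 100), standard_error_of_r(r, 1000))
        for r in (0.0, 0.2, 0.4, 0.6, 0.8)]

for r, se_100, se_1000 in rows:
    print(f"{r:.1f}  {se_100:.3f}  {se_1000:.4f}")
```

The standard error falls as r rises and, for a given r, in proportion to 1/√n, which is why the second column is roughly 0.316 times the first.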



INDEX. 


[The references are to pages. The subject-matter of the Exercises given at
the ends of the chapters has been indexed only when such exercises (or
the answers thereto) give the constants for statistical tables in the text,
or theoretical results of general interest; in all such cases the number of
the question cited is given. In the case of authors’ names, citations in
the text are given first, followed by citations of the authors’ papers or
books in the lists of references.]


Ability, general, refs., 392.

Accident, deaths from (law of small chances), 265-266.

Accidents, frequency-distributions, refs., 396.

Achenwall, Gottfried, Abriss der Staatswissenschaft, 2.

Adyanthaya, N. K., refs., sampling, 399.

Ages, at death of certain women (table), 78; of husband and wife
(correlation), 159; diagram, 173; constants, (qu. 3) 189.

Aggregate, of classes, 10-11.

Agricultural labourers’ earnings. See Earnings.

Agriculture, experiment, errors in, refs., 401-404.

Airy, Sir G. B., use of terms “error of mean square” and “modulus,”
144. Refs., Theory of Errors of Observations, 360.

Allan, F. E., refs., fitting polynomials, 393.

Ammon, O., hair and eye-colour data cited from, 61.

Analysis, harmonic. See Harmonic analysis.

Analysis of variance. See Variance.

Analysis Situs, refs., Hotelling, 393.

Anderson, O., correlation difference method, 198; refs., 208, 392, 393.

Annual value of dwelling-houses (table), 83; of estates in 1715,
table, 100; diagram, 101.

Arithmetic mean. See Mean, arithmetic.


Array, def., 164; standard-deviation of, 177, 204-205, 236-237; in
normal correlation, 319-321.

Association, generally, 25-59; def., 28; degrees of, 29-39; testing by
comparison of percentages, 30-35; constancy of difference from
independence values for the second-order frequencies, 35-38;
coefficients of, 37-39; illusory or misleading, 48-51; total possible
number of, for n attributes, 54-56; case of complete independence,
56-57; use of ordinary correlation-coefficient as measure of
association, 216-217; Pearson’s coefficient based on normal
correlation (refs.), 40, 333; refs., 15, 39-40, 333.

Association, partial, generally, 42-59; the problem, 42-43; total
and partial, def., 44; arithmetical treatment, 44-48; testing, in
ignorance of third-order frequencies, 51-54; refs., 57.

— examples: inoculation against cholera, 31-32, 34-35, 382-384;
deaths and occupation, 52-53; deaf-mutism and imbecility, 32-33;
eye-colour of father and son, 33-34; eye-colour of grandparent,
parent, and offspring, 46-48, 53-54; colour and prickliness of
Datura fruits, 36-37, 377-378; defects in school children, 45-46.

Asymmetrical frequency-distributions, 90-102; relative positions
of mean, median, and mode in, 121-122; diagrams, 113-114. See
also Frequency-distributions.

Asymmetry in frequency-distributions, measures of, 107, 149-150.

Attributes, theory of, generally, 1-59; def., 7; notation, 9-10,
14-15; positive and negative, 10; order and aggregate of classes,
10-11; ultimate classes, 12; positive classes, 13-14; consistence
of class-frequencies, 17-24 (see Consistence); association of,
25-59, 377-381 (see Association); sampling of, 254-334 (see
Sampling of attributes).

Averages, generally, 106-132; def., 107; desirable properties of,
107-108; forms of, 108; average in sense of arithmetic mean, 109;
refs., 129-130. See Mean, Median, Mode.

Axes, principal, in correlation, 321-322.

Bachelier, L., refs., Calcul des probabilités, 404; Le jeu, la chance
et le hasard, 404.

Bailey, M. A., refs., cotton trials, 402.

Baker, G. A., refs., sampling, 400.

Banister, H., refs., Elementary Applications of Statistical Method, 406.

Barlow, P., tables of squares, etc., 67; refs., 357.

Barometer heights, table, 96; diagram, 97; means, medians, and
modes, 122.

Bateman, H., refs., law of small chances, 273.

Baten, W. D., refs., moments (correlation), 391.

Bateson, W., data cited from, 37, 380-381.

Beaven, E. S., refs., yield trials, 402.

Becker, R., refs., Anwendungen der math. Statistik auf Probleme der
Massenfabrikation, 404.

Beetles (Chrysomelidae), sizes of genera, 363-364.

Beeton, Miss M., data cited from, 78.

Behrens, W. U., refs., yield trials, 403.

Bennett, T. L., refs., cost of living, 390.

Berkson, J., refs., Bayes’ Theorem, 400.

Bernoulli, J., refs., Ars Conjectandi, 360.

Berry, B. A., refs., variation in mangels, 402; errors in feeding
experiments, 401.

Bertillon, J., ref., Cours élémentaire de statistique, 6, 360.

Bertrand, J. L. F., refs., Calcul des probabilités, 360.

Betz, W., ref., Ueber Korrelation, 360.

Bias in sampling, 261-262, 279-281, 336-337, 343, 353.

— in scale-reading, 362-363.

Bielfeld, Baron J. F. von, use of word “statistics,” 1.

Binomial series, 291-300; genesis of, in sampling of attributes,
291-293; calculated series for different values of p and n, 294,
295; experimental illustrations of, 258, 259, (qu. 1 and qu. 2) 274,
371; graphic method of forming a representation of series, 295-297;
mechanical method of forming a representation of series, 297-299;
refs., 313; direct determination of mean and standard-deviation,
299-300; deduction of normal curve from, 301-302; refs., 314.

Bispham, J. W., refs., errors of sampling in partial correlations, 397.

Blakeman, J., refs., tests for linearity of regression, 209, 354;
probable error of contingency coefficient, 354.

Boole, G., refs., Laws of Thought, 23.

Booth, Charles, on pauperism, 193, 195.

Borel, E., refs., Théorie des probabilités, 360.

Bortkewitsch (Bortkiewicz), L. von, law of small chances, 265-266,
370; time-distributions, 389; refs., law of small chances, 273, 394;
sampling, 400.

Bowley, A. L., refs., effect of errors on an average, 356; on
sampling, 354; Measurement of Groups and Series, 354; Elements of
Statistics, 360, 404; Elementary Manual of Statistics, 360, 404;
cost of living, 390; index numbers, 391; Prices and Wages, 1914-20,
391.





Bowley, A. L., and R. L. Connor, goodness of fit, ref., 396.

Bravais, A., refs., correlation, 188, 332.

British Association, data cited from, stature, 88; weight, 95; see
Stature; Weight; Reports on index-numbers, refs., 130-131; Address
by A. L. Bowley on sampling, 354; mathematical tables, 401.

Brown, J. W., refs., index-correlations, 226, 252.

Brown, W., refs., effect of experimental errors on the
correlation-coefficient, 226; The Essentials of Mental Measurement,
360, 404.

Brownlee, J., refs., frequency curves (epidemiology and random
migration), 396.

Bruns, H., refs., Wahrscheinlichkeitsrechnung und
Kollektivmasslehre, 360.

Brunt, D., refs., The Combination of Observations, 404.

Burnside, W., refs., Theory of Probability, 406.

Camp, B. H., refs., correlation, 393; integrals for point binomial and
hypergeometric series, 395; sampling, 398.

Cave, Beatrice M., correlation difference method, 198; refs., 208.

Cave-Browne-Cave, F. E., correlation difference method, 198; refs.,
208.

Census (England and Wales), tabulation of infirmities in, 14-15;
data as to infirmities cited from, 32-33; classification of
occupations, as example of a heterogeneous classification, 72;
classification of ages, 80, and refs., 105; data as to ages of
husbands and wives cited from, 159.

Chaddock, R. E., refs., Principles and Methods of Statistics, 406.

Chance, in sense of complex causation, 30; of success or failure of
an event, 256.

Chances, law of small, 265-266, 366-370; refs., 273, 394.

Charlier, C. V. L., refs., theory of frequency curves, resolution of a
compound normal curve, 314, 315, 395.

Chebysheff, P. L., refs., fitting polynomials (see Isserlis, L.), 393;
means, 397; inequality, 398 (under Camp), 400 (under Smith, C. D.).

Childbirth, deaths in, application of theory of sampling, 282-284.

Chokhate, J. See Shohat, J.

Cholera and inoculation, illustrations, 31-32, 34-35, 382-384.

Christidis, B. Gr., refs., yield trials, 403.

Chrysomelidae, distribution of size of genus, 363-364.

Church, A. E. R., refs., probable errors, 398, 399.

Clapham, A. R., refs., yield trials, 403.

Class, in theory of attributes, 8; class symbol, 9; class frequency,
10; positive and negative classes, 10; ultimate classes, 12; order of
a class, 10.

Classification, generally, 8; by dichotomy, def., 9; manifold, 60-74,
76; homogeneous and heterogeneous, 71-72; of a variable for
frequency-distribution or correlation table, 76, 80-81, 157, 164.

Class-interval, def., 76; choice of magnitude and position, 79-80,
362-363; desirability of equality of intervals, 76, 82-83; influence
of magnitude on mean, 113-114, 115, 116; on standard deviation,
140, 212.

Cloudiness at Breslau, frequency-distribution, 103; diagram, 104.

Coefficient, of association, 37-39; of contingency, 64-67; of
variation, 149, standard error, 351; refs., distribution in sampling,
400; of correlation, see Correlation.

Collins, S. H., refs., agricultural experiments, 401.

Colours, naming a pair, example of contingency, 379-380.

Connor, R. L. See Bowley.

Consistence, of class-frequencies for attributes, generally, 17-24;
def., 18-19; conditions, for one or two attributes, 20; for three
attributes, 21-22; refs., 23.



Consistence of correlation-coefficients, 250-251.

Contingency tables, def., 60; treatment of, by elementary methods,
61-63; isotropy, 68-71, 328-331; testing of divergence from
independence, 378-380.

— coefficient of, 64-67; application to correlation tables, 167,
(qu. 3) 189; standard error of (refs.), 355, 397, 399; partial or
multiple contingency (refs.), 390.

Contrary classes and frequencies (for attributes), 10; case of
equality of contrary frequencies (qu. 6, 7, 8), 16; (qu. 8), 24;
(qu. 7, 8, 9), 59.

Correction of death-rates, etc., for age and sex-distribution,
223-225; refs., 226, 392.

— of standard-deviation for grouping of observations, 211-212; refs.
(including correction of moments generally), 225.

Correction of correlation-coefficient for errors of observation,
213-214; refs., 225-226, 392.

Correlation, generally, 157-253; construction of tables, 164;
representation of frequency-distribution by surface, 165-167;
treatment of table by coefficient of contingency, 167;
correlation-coefficient, 170-174, def. 174, direct deduction,
231-233; regressions, 175-177, direct deduction, 365-366, def. 175;
standard-deviations of arrays, 177, 204, 205; calculation of
coefficient for ungrouped data, 177-181, for a grouped table,
181-188; between movements of two variables, difference method,
197-199, fluctuation method, 199-201; refs., 208-209, 360, 392, 393,
401; elementary methods for cases of non-linear regression, 201-202;
rough methods for estimating coefficient, 202-204; correlation-ratio,
204-207, 252; effect of errors of observation on the coefficient,
213-214; correlation between indices, 215-216; coefficient for a
fourfold table, direct, 216-217, on assumption of normal correlation
(Pearson’s coefficient) (refs.), 40, 333, 390; for all possible pairs
of N values, 217-218; correlation due to heterogeneity of material,
218-219; effect of adding uncorrelated pairs to a given table,
219-220; application to theory of weighted mean, 221-223;
correlation in theory of sampling, 271, 286-289, 342, 349-350;
standard error of coefficient, 352. Refs., 188, 208-209, 225-226,
390, 391, 392, 393, 397, 398, 399, 400, 401, 406. For Illustrations,
Normal, Partial, Ratio, see below.

Correlation, Illustrations and Examples, correlation between:

    Two diameters of a shell (Pecten), 158; constants (qu. 3), 189.

    Ages of husband and wife, 159; diagram, 173; constants (qu. 3),
    189.

    Statures of father and son, 160; diagrams, facing 166, 174;
    constants (qu. 3), 189; correlation-ratios, 206-207; testing
    normality of table, 322-328; diagram of diagonal distribution,
    325; of contour-lines fitted with ellipses of normal surface, 327.

    Fertility of mother and daughter, 161, 195-196; diagram, 175;
    constants (qu. 3), 189.

    Discount rates and percentage of reserves on deposits, 162;
    diagram, facing 166.

    Sex-ratio and numbers of births in different districts, 163, 175;
    diagram, 176; constants (qu. 3), 189; correlation-ratios, 207;
    standard-deviations of arrays and comparison with theory of
    sampling, (qu. 7) 275 and (qu. 1) 289.

    Earnings of agricultural labourers, pauperism and out-relief,
    177-181; constants, (qu. 2) 189, 239; correlation-ratios, 207;
    treatment by partial correlation, 239-241; geometrical
    representation, 245-247.

    Old-age pauperism and out-relief, 182-185.

    Changes in pauperism, out-relief, proportion of old and
    population, 192-195; partial correlation, 241-245.

    Lengths of mother- and daughter-frond in Lemna minor, 185-187.

    Weather and crops, 196-197.

    Movements of infantile and general mortality, 197-199.

    Movements of marriage-rate and foreign trade, 199-201.

Correlation, normal, 317-334; deduction of expression for two
variables, 318-319; constancy of standard-deviation of arrays and
linearity of regression, 319-320; contour lines, 320-321; normality
of linear functions of two normally distributed variables, 321;
principal axes, 321-322; testing for normality of correlation table
for stature, 322-328; isotropy of normal correlation table, 328-331;
outline of theory for any number of variables, 331-332; coefficient
for a normal distribution grouped to fourfold form round medians
(Sheppard’s theorem), (qu. 4) 334; applications to theory of
qualitative observations (refs.), 333. Refs., 332-333, 390, 391, 397.

— partial, 229-253; the problem, partial regressions and
correlations, 229-231; direct deduction, 365-366; notation and
definitions, 233-234; normal equations, fundamental theorems on
product-sums, 234-235; significance of generalised regressions and
correlations, 236; reduction of standard-deviation, 236-237, of
regression, 237-238; of correlation, 238; arithmetical treatment,
238-245; representation by a model, 245-247; coefficient of n-fold
correlation, 247-249; expression of correlations and regressions in
terms of those of higher order, 249-250; consistence of coefficients,
250-251; fallacies, 251-252; limitations in interpretation of the
partial correlation-coefficient, partial association and partial
correlation, 252; partial correlation in case of normal distribution
of frequency, 331-332; refs., 252-253, 332-333, 393, 394, 397, 398.

Correlation ratio, 204-207; standard error, 352; refs., 209, 398, 400;
partial, 253, and refs., 252, 393, 398, 400.

Cosin, values of estates in 1715, 100.

Cost of living, refs., 390-391.

Cotsworth, M. B., refs., multiplication table, 358.

Cournot, A. A., refs., theory of probability, 361.

Craig, C. C., refs., sampling, 399, 400.

Cramer, H., refs., series used in mathematical statistics, 395;
theory of error, 395.

Crawford, G. E., refs., proof that arithmetic mean exceeds
geometric, 130.

Crelle, A. L., refs., multiplication table, 358.

Crops and weather, correlation, 196-197.

Crum, L. W., refs., Economic Statistics, 405.

Cunningham, E., ref., omega-functions, 314.

Czuber, E., refs., Wahrscheinlichkeitsrechnung, 361; Die
statistische Forschungsmethode, 404.

Darbishire, A. D., data cited from, 128, 265; refs., illustrations of
correlation, 188, 273.

Darmois, G., refs., time series, 393; Statistique Mathématique, 406.

Darwin, Charles, data cited from, 269-270.

Datura, association between colour and prickliness of fruit, 37, 38,
(qu. 10) 275, 380-381.

Davenport, C. B., data as to Pecten cited from, 158. Refs.,
statistical tables, 358.

Day, E. E., refs., Statistical Analysis, 405.

Deaf-mutism, association with imbecility, 33-34, 38; frequency
amongst offspring of deaf-mutes, table, 104.

Deaths, death-rates, association with occupation (partial correction
for age-distribution), 52-53; in England and Wales, 1881-1890,
table, 77; from diphtheria, table, 98, diagram, 97; infantile and
general, correlation of movements, 197-199; standardisation of, for
age and sex-distribution, 52-53, 223-225, refs., 226, 392;
applications of theory of sampling: deaths from accident, 265-266,
deaths in childbirth, 282-284, deaths from explosions in mines,
287-288; inapplicability of the theory of simple sampling, 260-261,
282-284, 285-286, 287-288; criteria (refs.), 390.

Deciles, 150-152; standard error of, 337-341.

Defects, in school children, association of, 12, 45-46, refs., 15;
census tabulation of, 14-15.

De Morgan, A., refs., Formal Logic, 23; Theory of Probabilities, 361.

Detlefsen, J. A., refs., fluctuations of sampling in Mendelian
population, 394.

Deviation, mean, 134; generally, 144-147; def., 144; is least round
the median, 144-145; refs., 154; calculation of, 145-146, (qu. 7)
155-156; comparison of advantages with standard-deviation, 146, of
magnitude with standard deviation, 146-147; of normal curve, 304.

Deviation, quartile. See Quartiles.

— root-mean-square. See Deviation, standard.

— standard, 134-144; def., 134; relation to root-mean-square
deviation from any origin, 134-135; is the least possible
root-mean-square deviation, 135; little affected by small errors in
the mean, 135; calculation for ungrouped data, 135-137, for a
grouped distribution, 138-141; influence of grouping, 140, 211-212;
range of six times the s.d. contains the bulk of the observations,
140-142, 309; of a series compounded of others, 142-143; of N
consecutive natural numbers, 143; of rectangle, 143; of arrays in
theory of correlation, 177, 204, 205, 319-320; of generalised
deviations (arrays), 234, 236-237; other names for, 144; of a sum
or difference, 210-211; effect of errors of observation on, 211; of
an index, 214-215; of binomial series, 299-300; of law of small
chances, 366-370. For standard-deviations of sampling, see Error,
standard.

De Vries, H., data cited from, 102.

Dice, records of throwing, 258-259, (qu. 1, 2, 3) 274, 371; testing for
significance of divergence from theory, 267, 373-376; refs., 273.

Dickson, J. D. Hamilton, normal correlation surface, 328. Refs.,
normal correlation, 333.

Difference method in correlation, 197-199; refs., 226, 252, 392-393.

Diphtheria, ages at death from, table, 98; diagram, 97.

Discounts and reserves in American banks, table, 163; diagram,
facing 166.

Dispersion, measures of, 107, 133-150; unsuitability of range as
a measure, 123; relative, 149; refs., 154. See Deviation, mean;
Deviation, standard; Quartiles.

Distribution of Frequency. See Frequency-distribution.

Dodd, E. L., refs., frequency curves, 395; sampling, 397, 398.

Doodson, A. T., refs., mode, median, and mean, 390.

Duckweed, correlation between mother- and daughter-frond, 185-187.

Duffell, J. H., ref., tables of gamma-function, 358.

Duncker, G., relation between geometric and arithmetic mean
(qu. 9), 156.

Earnings of agricultural labourers, calculation of
standard-deviation, 135-137; mean deviation, 145; quartiles, 147;
correlation with pauperism and out-relief, 177-181, constants,
(qu. 2) 189, 239; diagram, 180; by partial correlation, 239-247;
diagram of model, 246.

Eden, T., refs., yield trials, 402; with tea, 403.

Edgeworth, F. Y., dice-throwings (Weldon), 258; probable error of
median, etc., 344. Refs., index-numbers, 130-131, 391; correlation,
188, 252, 333; law of error (normal law) and frequency-curves
generally, 273, 314, 395; theory of sampling, probable errors, etc.,
273, 354; dissection of normal curve, 315.

Elderton, E. M., refs., variate difference method, 392; normal curve
tables, 395; sampling, 399.

Elderton, W. Palin, refs., calculation of moments, 154; table of
powers, 358; tables for testing fit, 354, 358; Frequency Curves and
Correlation, 154, 361, 404.

Engineering, applications of statistical method, refs., 404.

Engledow, F. L., refs., yield trials, 402.

Epidemiology, applications of statistical method to, refs., 396.

Error, law of; errors, curve of. See Normal curve.

— mean, 144.

— mean square, 144.

— of mean square, 144.

— probable, in sense of semi-interquartile range, 147; in theory of
sampling, 310-311. For general references, see Error, standard.

— standard, def., 267; of number or proportion of successes in n
events, 256-257; when numbers in samples vary, 264-265; when
chance of success or failure is small, 265-266; of percentiles
(median, quartiles, etc.), 337-341; of arithmetic mean, 344-350; of
standard-deviation and coefficient of variation, 351; of coefficients
of correlation and regression, 352; of correlation-ratio and test for
linearity of regression, 352; refs., 273, 289, 354-355, 397-401. See
also Sampling, theory of.

— theory of. See Sampling, theory of.

Estates, annual value of. See Value.

Everitt, P. F., refs., tables for calculating Pearson’s coefficient
for a fourfold table, 358.

Exclusive and inclusive notations for statistics of attributes, 14-15.

Explosions in coal-mines, deaths from, as illustrating theory of
sampling, 288.

Eye-colour, association between father and son, 34-35, 38, 70-71;
association between grandparent, parent, and child, 46-48, 53-54;
contingency with hair-colour, 61, 63, 66-68; non-isotropy of
contingency table for father and son, 70-71.

Ezekiel, Mordecai, refs., correlation, 393, 394; sampling, 400;
Methods of Correlation Analysis, 406.

Falkner, R. P., refs., translation of Meitzen’s Theorie der
Statistik, 6.

Fallacies, in interpreting associations: theorem on, 48-49,
illustrations, 49-51; owing to changes of classification, actual or
virtual, 72; in interpreting correlations: “spurious” correlation
between indices, 215-216; correlation due to heterogeneity of
material, 218-219; difference of sign of total and partial
correlations, 251-252.

Fay, E. A., data cited from Marriages of the Deaf in America, 104.

Fechner, G. T., refs., frequency-distributions, averages, measures of
dispersion, etc., 129, 154; Kollektivmasslehre, 129, 314, 361.

Fecundity of brood-mares, table, 96; diagram, 94; mean, median, and
mode, (qu. 3) 131; inheritance (ref.), 208, 226.

Feeding trials, errors in, refs., 401-404.

Fertility of mother and daughter, correlation, 161, 195-196;
diagram, 175; constants, (qu. 3) 189; ref., 208, 226.

Field trials, errors in, ref., 401-404.

Filon, L. N. G., ref., probable errors, 354.

Fisher, A., refs., Mathematical Theory of Probabilities, 404.

Fisher, Irving, refs., index-numbers, 390, 391.

Fisher, R. A., use of term “variance,” 144; testing goodness of fit,
378, 387; refs., goodness of fit, 396, 397; of regression lines, 391;
errors of sampling in correlation-coefficient, 354, 397, 399;
probable errors, 397, 398, 399, 400; extremes of sample, 399; yield
trials, 402, 403; Statistical Methods for Research Workers, 405.

Fit of a theoretical to an actual frequency-distribution, testing,
generally, 370-389; comparison frequencies given a priori, 370-378;
cautions, 373-376; experimental illustration, 377-378; comparison
frequencies based on the observations, 378-389; contingency tables,
378-380; association tables, 380-383; aggregate of tables, 383-384;
experimental illustrations, 384-387; P-table for use with association
tables, 388-389; refs., 315, 391, 396-397; tables for, 368.

Fluctuation, measure of dispersion, 144.

Flux, A. W., refs., measurement of price-changes, 390.

Forcher, H., refs., Die statistische Methode als selbständige
Wissenschaft, 404.

Fountain, H., ref., index-numbers of prices, 131.

Frequency of a class, 10, 76.

Frequency-curve, def., 87; ideal forms of, 87-105; normal curve
(q.v.), 301-313; refs., 105, 314, 394-396.

Frequency-distributions, 76; formation of, 79-83; graphic
representation of, 83-87; ideal forms: symmetrical, 87-90,
moderately asymmetrical, 90-98, extremely asymmetrical (J-shaped),
98-102, 363-364, U-shaped, 102-106; binomial series, 291-300;
hypergeometrical series (ref.), 289; normal curve, 301-313;
theoretical forms, refs., 289, 314, 394-396; testing goodness of
fit, 373-376. See Binomial series; Normal curve; Correlation, normal.

— illustrations: of death-rates in England and Wales, 77; of ages
at death of certain women, 78; of stigmatic rays on poppies, 78; of
annual values of dwelling-houses in Great Britain, 83; of
head-breadths of Cambridge students, 84; of statures of males in the
U.K., 88, 90; of pauperism in different districts of England and
Wales, 93; of weights of males in the U.K., 95; of fecundity of
brood-mares, 96; of barometer heights at Southampton, 96; of ages at
death from diphtheria, 98; of annual values of estates, 100; of
petals in Ranunculus bulbosus, 102; of degrees of cloudiness at
Breslau, 103; of percentages of deaf-mutes in offspring of
deaf-mutes, 104; sizes of genera (Chrysomelidae), 364. See also
Correlation, illustrations and examples.

Frequency-polygon, construction of, 84.

Frequency-surface, forms and examples of, 164-167; diagrams, 166,
facing 166; normal, diagram, 166. See Correlation, normal.

Frisch, Ragnar, refs., correlation, 391; time series, 393.

Fry, T. C., refs., Probability and its Engineering Uses, 404.

Gabaglio, A., ref., Teoria generale della statistica, 6.

Galloway, T., ref., Treatise on Probability, 361.

Galton, Sir Francis, Hereditary Genius, 3; frequency-distribution of
consumptivity, 104; grades and percentiles, 150, 152; regression,
176; Galton’s function (correlation-coefficient), 204; binomial
machine, 299; normal correlation, 328; data cited from, 34, 46, 70.
Refs., geometric mean, 130; percentiles, 164; correlation, 188, 332;
correlation between indices, 226; binomial machine, 313; Natural
Inheritance, 154, 313, 332.

Gamma functions, tables, refs., 401.

Gauss, C. F., use of term “mean error,” 144. Refs., normal curve,
314; method of least squares, 361.

Geary, R. C., refs., frequency distributions, 395.

Geiger, H., refs., law of small chances, 269.





Geometric mean. See Mean, geometric.

Geometric (logarithmic) mode, 128.

Gibbs, J. Willard, Principles of Statistical Mechanics, 4.

Gibson, Winifred, refs., Tables for computing probable errors, 354,
358.

Gini, C., refs., index-numbers, 391.

Goodness of fit, generally, 370-389; refs., 391, 396-397. See also Fit.

Grades, 152, 153.

Graphic method, of representing frequency-distributions, 83-87; of
interpolation for median or percentiles, 118, 151-152; of
representing correlation between two variables, 180-181; of
estimating correlation-coefficient, 203-204; of forming one binomial
polygon from another, 295-297.

Graunt, John, ref., Observations on the Bills of Mortality, 6.

Gray, John, data cited from, 270.

Greatest and least values of sample, refs., 398 (Dodd), 399 (Fisher
and Tippett).

Greenwood, M., refs., index correlations, 226, 252; errors of
sampling (small samples), 289, 398; inoculation statistics and
association, 40; application of law of small chances, 394; multiple
happenings, 396.

Grindley, H. S., refs., errors of feeding trials, 402.

Grouping of observations to form frequency-distribution, choice of
class-interval, 79-80; influence on mean, 113-114, 115, 116;
influence on standard-deviation, 140, 212.

Grubb, N. H., refs., error in currant trials, 402.

Gumbel, E. J., refs., spurious correlation, 392.


Hair-colour, and eye-colour, example of contingency, 61-63, 66-67;
non-isotropy, 68, 69; theory of sampling applied to certain data,
270-271, 272.

Hall, A. D., refs., errors of agricultural experiment, 401, 402.

Hall, Philip, refs., partial correlation, 394; probable errors, 398.

Harmonic analysis, sampling, refs., 399 (see Fisher, R. A.).

Harmonic mean. See Mean, harmonic.

Harper, T. H., refs., Practical Statistics, 406.

Harris, J. A., refs., short method of calculating coefficient of
correlation, 209; intra-class coefficients, 209; correlation,
miscellaneous, 392; error in field experiments, 401.

Hart, B., refs., effect of errors on correlation, 392.

Hatton, R. G., refs., error in currant trials, 402.

Hayes, H. K., refs., variety trials, 402, 403.

Head-breadths of Cambridge students, table, 84; diagram, 85.

Helguero, P. de, refs., dissecting compound normal curve, 315.

Henry, A., refs., Calculus and Probability, 404.

Heron, D., refs., association, 40; relation between fertility and
social status, 208; defective physique and intelligence, application
of correction for age-distribution, etc., 226; abac giving probable
errors of correlation-coefficient, 354, 358; probable error of a
partial correlation-coefficient, 354.

Histogram, construction of, 84.

History, refs., of statistics generally, 5-6, 390; of correlation,
188, 391; of normal curve, 395.

Hoblyn, T. L., refs., horticultural experiment, 403.

Hollis, T., cited re Cosin’s Names of the Roman Catholics, etc., 100.

Holzinger, K. S., refs., sampling, 399.

Hooker, R. H., correlation between weather and crops, 196; between
movements of two variables, 200, 201. Refs., correlation between
movements of two variables, 208; weather and crops, 208, 253;
theory of partial correlation, 252.

Horticulture, errors in, refs., 401-404.

Hotelling, Harold, refs., history, 390; Analysis Situs, 393; time
series, 393; probable errors, 398, 400.

Houses, inhabited and uninhabited, in rural and urban districts,
61-62; annual value of, table, 83, median, (qu. 4) 131; quartiles,
(qu. 3) 155.

Hubback, J. A., refs., rice trials, 402.

Hudson, H. P., refs., frequency-curves (epidemiology), 396.

Hull, C. H., ref., The Economic Writings of Sir William Petty,
together with the observations on the Bills of Mortality more probably
by Captain John Graunt, 6.

Husbands and wives, correlation between ages, table, 159; diagram,
173; constants, (qu. 3) 189.

Hypergeometrical Series, ref., 289.

Illusory associations, 48-51. 
Imbecihty, associations with deaf- 
mutism, 32-33, 38 
Immer, P. R , lefs , field trials, 403 
Inclusive and exclusive notations for 
statistics of attributes, 14-15 
Independence, criterion of, for attii- 
butes, 25-28 , case of complete, 
for attributes, 56-57 , form of 
contingency or correlation table 
in case of, 71 , goodness of fit test 
for, 378-387 

Index -numbeis of prices, def , 126 , 
use of geometric mean for, 126- 
127 , use of harmonic mean, 129 , 
refs , 130-131, 390-301 
Indices, correlation between, 215- 
216 , refs , 226, 252, 392 
Infirmities, census tabulation of, 
14-15 , association between deaf- 
mutism and imbecility, 32-33, 38. 
Inoculation, cholera, examples, 31- 
32, 34-35, 382-384 
Intermediate observations, in a 
frequency-distribution, classifica- 
tion of, 80-81, 362-363 ; in corre- 
lation table, 164 

Irwin, J. O., refs., analysis of vari-
ance, 394; goodness of fit, 396;
probable errors, etc., 398, 399,
400; recent advances, 401
Isotropy, def , 68 , generally, 67-71 , 
of normal correlation table, 328- 
331 , refs , 73. 

Isserlis, L., refs., partial correlation-
ratio, 252, 393; conditions for
real significance of probable errors,
354; fitting polynomials (Cheby-
sheff), 393; probable error of
mean, 397; small samples (see
Greenwood), 398

Jacob, S. M , ref., crops and rainfall, 
208, 226 

Jeffery, G. B., refs., sampling, 399

Jevons, W. Stanley, use of geometric
mean, 127. Refs., system of
numerically definite reasoning
(theory of attributes), 15; index-
numbers, 130; Pure Logic and
other Minor Works, 15; Investiga-
tions in Currency and Finance,
130

Johannsen, W., refs., Elemente der
exakten Erblichkeitslehre, 361

John, V., refs., Geschichte der Sta-
tistik, 5

Jones, D. C., refs., A First Course in
Statistics, 404

Jordan, C., refs., time series, 393;
Statistique Mathématique, 406

Jorgenson, M , refs , agricultural 
experiment, 403 

J-shaped frequency-distributions,
98-102, 363-364

Julin, A., refs., Principes de Statis-
tique, 405

Kapteyn, J. C., refs., Skew Fre-
quency-curves in Biology and Sta-
tistics, 130, 314

Kelley, T L , refs., correlation, 393, 
394 , Statistical Method, 405. 

Keynes, J M , refs., A Treatise on 
Probability, 405. 

Kick of a horse, deaths from, follow- 
ing law of small chances, 265-266, 
369-370 

Kindermann, M , refs , yield trials, 
403 

King, George, refs., graduation of
age statistics, 105

Knibbs, G. H., refs., price index-
numbers, 390; frequency-curves,
396

Knight, R. C., refs., error in currant
trials, 402

Kohlweiler, E , refs , Statistik im 
Dienste der Technik, 404. 





Kotin, S., refs., Theory of Statistical
Method, 406

Kondo, T., refs., normal curve
tables, 395; sampling, 399, 400
Koren, J , refs , History of Statistics, 
390 

Labour Gazette, Index Number, refs , 
391. 

Labourers, earnings of agricultural.
See Earnings

Laplace, Pierre Simon, Marquis de,
probable error of median, 344.
Refs., normal curve, 314; mean
deviation least about the median,
154; Théorie analytique des proba-
bilités, 154, 354, 361; Essai philo-
sophique, 361, 405

Larmor, Sir J., use of word “statis-
tical,” 4

Lee, Alice, data cited from, 96, 122,
160, 161. Refs., inheritance of
fertility and fecundity, 208, 226;
tables of functions, 358, 359
Lemna minor, correlation between 
lengths of mother- and daughter- 
frond, 185-187 

Lexis, W., use of term “precision,”
144. Refs., Theorie der Massen-
erscheinungen, 273; Abhandlungen
zur Theorie der Bevölkerungs- und
Moralstatistik, 273, 361
Linearity of regression, test for,
205-206, 352; refs., 391. See also
Correlation-ratio

Lipps, G. F., refs., measures of
dependence (association, correla-
tion, contingency, etc.), 40;
Fechner’s Kollektivmasslehre, 129,
360

Little, W., data as to agricultural
labourers’ earnings cited from, 137
Lloyd, W. E., refs., error in soil
surveys, 402

Lobelia, application of theory of
sampling to certain data, 269-270,
272

Logarithmic increase of population,
125-126; logarithmic mode, 128
Lord, L., refs., rice trials, 402
Lyon, T. L., refs., errors of agri-
cultural experiment, 402

Macalister, Sir Donald, ref., law
of geometric mean, 130, 314

Macaulay, F. G., refs., smoothing
time series, 393

Macdonell, W R , data cited from, 
84, 90 

March, L., refs., correlation, 208;
index-numbers, 391; Les Prin-
cipes de la Méthode Statistique, 406

Marriage-rate and trade, correlation 
of movements, 199-201. 

Marshall, A , ref , Money Credit and 
Commerce, 391 

Maskell, E J , refs , experimental 
error in agriculture, 403 , sugar 
cane, 403 

Maxwell, Clerk, use of word “ sta- 
tistical,” 4 

McKay, A T , refs , sampling, 400 

McNemar, Q , refs , partial correla- 
tion, 394 

Mean, arithmetic, generally, 108-
116; def., 108-109; nature of,
109; calculation of, for a grouped
distribution, 109-113; influence
of grouping, 113-114, 115, 116;
position relatively to mode and
median, 121-122, (refs.) 390; dia-
grams, 113, 114; sum of devia-
tions from, is zero, 114; of series
compounded of others, 115; of
sum or difference, 115-116; com-
parison with median, 119; sum-
mary comparison with median and
mode, mean is the best for all
general purposes, 122-123; weight-
ing of, 220-225; of binomial
series, 299; of law of small
chances, 369; standard error of,
334-350, (refs.) 355, 397-401

— deviation. See Deviation, mean

— error, 144. See Error, standard;
Deviation, standard

— geometric, 108; generally, 123-
128; def., 123; calculation, 124;
less than arithmetic mean, 123;
difference from arithmetic mean
in terms of dispersion, (qu. 8) 156;
of series compounded of others,
124; of series of ratios or pro-
ducts, 124; in estimating inter-
censal populations, 125-126; con-
venience for index-numbers, 126-
127; use on ground that devia-
tions vary with absolute magni-
tude, 127-128; weighting of, 225

— harmonic, 108; generally, 128-
129; def., 128; calculation, 128;
is less than arithmetic and geo-
metric means, 129; difference
from arithmetic mean in terms of
dispersion, (qu. 9) 156; use in
averaging prices for index-numbers,
129; in theory of sampling, when
numbers in samples vary, 264-
265

Mean square error, 144 

— weighted, 220-225 , def , 220 , 
difference between weighted and 
unweighted means, 221-223 , ap- 
plication of weighting to correc- 
tion of death-rates, etc , for age- 
and sex-distribution, 223-225 , 
refs , 226, 392 

Median, 108; generally, 116-120;
def., 116; indeterminate in cer-
tain cases, 116-117; unsuited to
discontinuous observations and
small series, 116-117; calculation
of, 117; graphical determination
of, 118; comparison with arith-
metic mean, 119; advantages in
special cases, 119-120; slight in-
fluence of outlying values on, 120;
position relatively to mean and
mode, 121-122; diagrams, 113,
114, (refs.) 387; weighting of,
225; standard error of, 337-341,
(refs.) 354

Meidell, H B., refs , sampling, 398. 

Meitzen, P. A., refs., Geschichte,
Theorie und Technik der Sta-
tistik, 6

Mendelian breeding experiments as
illustrations, 37, 38, 128, 264-265,
267-268; refs., fluctuations of
sampling in, 273, 394.

Mercer, W. B., refs., errors of agri-
cultural experiment, 402.

Methods, statistical, purport of, 3-5;
def., 5

Mice, numbers in litters, harmonic
mean, 128-129; proportions of
albinos in litters, fluctuations
compared with theory of sam-
pling, 264-265

Migration, random, refs., 396

Milk-testing, errors in, refs., 401

Milton, John, use of word “statist,”
1

Miner, J. R., correlation, ref., 393

Mises, R. von, refs., Wahrscheinlich-
keit, Statistik und Wahrheit, 406;
Wahrscheinlichkeitsrechnung, 407
Mitchell, H. H., refs., errors of feed-
ing trials, 402

Mitscherlich, E. A., refs., yield trials,
403

Mode, 108; generally, 120-123;
def., 120; approximate deter-
mination, from mean and median,
121-122; diagrams showing posi-
tion relatively to mean and
median, 113, 114; logarithmic
or geometric mode, 128; weight-
ing of, 225; refs., 130, 390
Modulus as measure of dispersion,
144; origin from normal curve,
304

Mohl, Robert von, refs., Geschichte
und Literatur der Staatswissen-
schaften, 5

Moir, H., refs., frequency-curves
(mortality), 396

Molina, E. C., refs., Bayes’ Theorem,
401.

Moller-Arnold, E , refs., field trials, 
403 

Moment, first, def., 110; second and
general, def., 135; calculation of
moments, (ref.) 154; errors of
sampling, 354-356, 397-401
Montessus de Ballore, R. de, refs.,
Probabilités et Statistiques, 407
Moore, L. Bramley, data cited from,
96, 161. Refs., inheritance of fer-
tility and fecundity, 208, 226
Morant, G., refs., law of small
chances, 394

Mortality. See Death-rates 
Movements, correlation of, in two 
variables, methods, 197-201 ; refs , 
208, 392-393 

Negative classes and attributes, 10. 
Newbold, Ethel M , refs , frequency- 
distributions, accidents, 396 
Newsholme, A., refs., birth-rates,
correction for age-distribution,
etc., 226; Vital Statistics, 359, 406
Neyman, J., refs., goodness of fit,
397; probable errors, 398, 399,
400; yield trials, 403
Niceforo, A., refs., La Méthode Sta-
tistique, 405

Nixon, J. W., refs., experimental
test of normal law, 314.





Normal curve of errors; deduction
from binomial series, 301-302;
value of central ordinate, 304;
table of ordinates, 303; mean
deviation and modulus, 304; com-
parison with binomial series for
moderate value of n, 304-305;
outline of more general methods of
deduction, 305-307; fitting to a
given distribution, 307-308; the
table of areas, 310, and its use,
309-310; quartile deviation and
probable error, 310-311; numeri-
cal examples of use of tables, 311-
313; normality in fluctuations of
sampling of the mean, 346-347.
Refs., general, 314; dissection of
compound curve, 315; tables,
358-359, 395, 401; history, 395.
For normal correlation, see Corre-
lation, normal

Norton, J. P., data cited from, 162.
Ref., Statistical Studies in the New
York Money Market, 208

Nybolle, H. C., refs., Theorie der
Statistik, 406

O’Brien, D. G., refs., errors in feed-
ing experiments, 401

Order, of a class, 10 , of generalised 
correlations, regressions, devia- 
tions, and standard deviations, 
233-234. 

Palgrave, Sir R. H. I., Dictionary
of Political Economy, 6

Papadakis, J., refs., yield trials, 403

Pareto, V., refs., Cours d’économie
politique, 105

Partial association. See Association,
partial

— correlation See Correlation, 
partial 

Patton, A C , refs , Economic Sta- 
tistics, 405 

Pauperism, in England and Wales,
table, 93; diagrams, 92, 113; cal-
culation of mean, 111; of median,
117, 118; means, medians, and
modes for other years, 122; stan-
dard-deviation, 138-140; mean
deviation, 145-146; quartiles,
148; percentiles, 151-152

— correlation with out-relief, 182-
185; with earnings and out-relief,
177-181, (qu. 2) 189, 239-241,
245-247; with out-relief, propor-
tion of aged, etc., 192-195, 241-
245.

Pearl, Raymond, normal distribu-
tion of number of seeds in Nelum-
bium, 306. Refs., probable errors,
355; errors in variety tests, 402;
Introduction to Medical Biometry,
405.

Pearson, E. S., refs., polychoric
coefficients, 390; goodness of fit,
397; probable errors, 398, 399, 400.

Pearson, Karl, contingency, 63, 65;
mode, 120; standard-deviation,
144; coefficient of variation, 149;
skewness, 149; inheritance of
fertility, 195; spurious correla-
tion between indices, 215; bi-
nomial apparatus, 299; deduction
of normal curve, 303; data cited
from, 70, 78, 90, 96, 122, 160, 161.
Refs., correlation of characters not
quantitatively measurable, 40,
333; contingency, etc., 72-73,
333, 390, 397; frequency-curves,
105, 130, 154, 273, 289, 314, 315,
354, 395; binomial distribution
and machine, 314; hypergeomet-
rical series, 289; dissection of
compound normal curve, 315;
calculation of moments, 225;
general methods of curve fitting,
209; testing fit of theoretical to
actual distribution, 315, 391, 396;
correlation and correlation-ratio,
188, 209, 225, 252, 333, 390, 391,
392, 393, 397; fitting of principal
axes and planes, 209, 333; corre-
lation between indices, 226;
inheritance of fertility, 226;
weighted mean, reproductive se-
lection, 226; probable errors, 355,
394, 397, 398, 399, 401; tables
for statisticians, 358, 401; tables
of Gamma functions, 401; poly-
choric coefficients of correlation,
390; variate difference method,
392.

Peas, applications of theory of sam-
pling to experiments in crossing,
267-268

Pecten, correlation between two 
diameters of shell, 158 , con- 
stants, (qu 3) 189 





Pepper, J., refs., sampling, 399
Percentage, standard error of, 256-
257; when numbers in samples
vary, 264-265. See also Sam-
pling of attributes

Percentiles, 150-153; def., 150; de-
termination, 151-152; advantages
and disadvantages, 152-153; use
for unmeasured characters, 152-
153, refs., 333; standard errors
of, 337-341; correlation between
errors of sampling in, 341-342;
refs., 154, 354-356
Perozzo, L., ref., applications of
theory of probability to correla-
tion of ages at marriage, 314
Persons, W. M., refs., index-
numbers, 391

Petals, of Ranunculus bulbosus, fre-
quency of, 102; unsuitability of
median in case of such a distribu-
tion, 117

Peters, J., refs., multiplication table,
358

Petty, Sir W., refs., Economic
Writings, 6

Pickering, S. U., refs., errors of agri-
cultural experiment, 401
Plaut, H., refs., Anwendungen der
math. Statistik auf Probleme der
Massenfabrikation, 404
Poincaré, H., refs., Calcul des prob-
abilités, 361

Poisson, S. D., law of small chances,
368, 369; refs., sex-ratio, 273;
generally and applications, 394;
Recherches sur la probabilité des
jugements, 273, 361
Poppies, stigmatic rays on, fre- 
quency, 78 , unsuitability of 
median in such a distribution, 
116 

Population, estimation of, between 
censuses, 125-126 , refs , 130, 
253 

Positive classes and attributes, def.,
10; number of positive classes,
13; sufficiency of, for tabulation,
13; expression of other fre-
quencies, in terms of, 13-14.
Poynting, J. H., correlation of fluc-
tuations, 201; refs., 208
Precision, 144, 257, 304.

Pretorius, S. J., refs., skew frequency
surfaces, 397.


Prices, index-numbers of, 126; use
of geometric mean, 126; of har-
monic mean, 129; refs., 130-131,
390-391

Principal axes, in correlation, 321-
322; ref., 333

Probability, theory of, works on,
refs., 361, 404-407

Quartile deviation. See Quartiles

Quartiles, quartile deviation and
semi-interquartile range, 134;
generally, 147-149; defs., 147;
determination, 147-148; ratio of
q.d. to standard-deviation, 148,
310; advantages of q.d. as a
measure of dispersion, 148-149;
difference between deviations of
quartiles from median as measure
of skewness, 149-150; ratio of
q.d. to median as measure of re-
lative dispersion, 149; q.d. of
normal curve, 310; standard
errors, 337-341, 341-343; refs.,
354-356

Quetelet, L. A. J., refs., Lettres sur la
théorie des probabilités, 272, 361

Random sampling, in sense of simple
sampling, 289

Range, unsuitability of, as a measure
of dispersion, 133

Ranks, 143, 153; methods of corre-
lation based on (refs.), 333.

Ranunculus, frequency of petals,
102; unsuitability of median for
such distributions, 117.

Registrar-General: correction or
standardisation of death-rates,
224; refs., 226, 392; estimates
of population, refs., 130; data
cited from Reports, 32-33, 52-53,
77, 98, 163, 197-199, 199-201,
222, 263, 283, 284, 285-286.

Regressions, generally, 175-177;
def., 175; total and partial, 233;
standard errors of, 352; non-
linear, 201-202, 205-206, 352;
direct deduction, 365-366; refs.,
208-209, 391, 392, 393, 394

Relative dispersion, 149

Reserves and discounts in American
banks, correlation, 162; diagram,
facing 166.





Rhind, A., ref., tables for computing
probable errors, 355, 359
Rhodes, E. C., refs., fitting poly-
nomials, 393; sampling, 394,
396; law of error, 395; sampling,
398, 399

Rider, P. R., refs., sampling, 399, 400
Rietz, H. L., refs., frequency dis-
tributions, 395; Handbook of
Mathematical Statistics, 405;
Mathematical Statistics, 405
Ritchie, E. D., refs., agronomic ex-
periment, 403

Ritchie-Scott, A., refs., correlation
of polychoric table, 390
Robinson, G., refs., Calculus of
Observations, 405

Robinson, G. W., refs., error in soil
surveys, 402

Roemer, T., refs., yield trials, 403
Romanovsky, V., refs., frequency-
curves, 395; sampling, 399
Ross, Sir R., refs., frequency-curves
(epidemiology), 396
Runge, I., refs., Anwendungen der
math. Statistik auf Probleme der
Massenfabrikation, 404
Russell, E. J., refs., errors of agri-
cultural experiment, 401
Russell, W. T., refs., Medical Sta-
tistics, 407

Rutherford, E., ref., law of small
chances, 273

Salisbury, E. S., refs., correlation,
393, 394 (see Kelley)

Salvosa, L. R., refs., frequency-dis-
tributions (tables), 395
Sampling, theory of, generally, 254-
355; the problem, 254-256; refs.,
273, 289, 313-315, 354-356, 392,
393, 394-401

— of attributes: conditions as-
sumed in simple sampling, 255-
256, 259-262; random in sense of
simple sampling, 289; standard-
deviation of number or proportion
of successes in n events, 256-257,
299-300; examples from artificial
chance, 258-259; application to
sex-ratio, 262-264; when num-
bers in samples vary, 264-265;
when chance of success or failure
is small, 265-266, 366-370; stan-
dard error, def., 267; comparing
a sample with theory, 267-268;
comparing one sample with an-
other independent therefrom, 268-
271; comparing one sample with
another combined with it, 271-
272; limitations to interpretation
of standard error when n is small,
inverse interpretation, 276-279;
limits as a measure of untrust-
worthiness, 279-281; effect of
removing conditions of simple
sampling, 281-289; sampling
from limited material, 287; bi-
nomial distribution, 291-300; nor-
mal curve, 300-313; normal cor-
relation, 317-334; law of small
chances, 366-370; refs., 272-273,
393, 395, 397-401. See also
Binomial series; Hypergeometri-
cal series; Normal curve; Cor-
relation, normal

Sampling of variables, conditions
assumed in simple sampling, 335-
337; standard errors of percen-
tiles (median and quartiles), 337-
341; dependence of standard
error of median on the form of the
distribution, 338-340; of differ-
ence between two percentiles,
341-343; of arithmetic mean,
344-350; of difference between
two means, 345-346; normality
of distribution of mean, 346-347;
effect of removing conditions of
simple sampling on standard error
of mean, 347-350; standard error
of standard-deviation and co-
efficient of variation, 351; of co-
efficients of correlation and re-
gression, 352; of correlation-ratio
and test for linearity of regression,
352; refs., 354-356, 397-401.

Sanders, H. G., refs., uniformity
trials, 403

Saunders, Miss E. R., data cited
from, 37

Scale-reading, bias in, 362-363

Scarborough, J. B., refs., Numerical
Mathematical Analysis, 406

Scheibner, W., difference between
arithmetic and geometric, arith-
metic and harmonic means, (qu. 8
and qu. 9) 156

Scripture, E. W., use of word
“statistics,” 3





Secrist, H., refs., Introduction to
Statistical Methods, 405
Semi-interquartile range. See Quar-
tiles

Sex-ratio of births: correlation with
total births, 163, 175, 207; dia-
gram, 176; constants, (qu. 3) 189;
application of the theory of sam-
pling to, 262-264, (qu. 7) 275, (qu.
1, 2) 289, refs., 273; standard
error of ratio of male to female
births, (qu. 11) 275
Shakespeare, W., use of word
“statist,” 1.

Sheppard, W. F., correction of the
standard-deviation for grouping,
212, 307; theorem on correlation
of a normal distribution grouped
round medians, (qu. 4) 334;
normal curve tables, 337; stan-
dard errors of percentiles, 344.
Refs., calculation and correction
of moments, 225; normal curve
and correlation, theory of sam-
pling, 314, 333, 355; tables of
normal function and its integral,
359; goodness of fit, 397
Shewhart, W. A., refs., Engineering
Applications of Statistical Method,
404

Shohat, J. (Chokhate, J.), refs.,
sampling, 399
Significant differences, 266.

Simpson, T. Wake, refs , yield trials, 
403 

Sinclair, Sir John, use of words
“statistics,” “statistical,” 2
Sipos, A., refs., time series, 393
Skew or asymmetrical frequency-
distributions, 90-102. See also
Frequency-distributions.

Skewness of frequency-distributions,
107; measures of, 149-150
Slutsky, E., refs., fit of regression
lines, 209, 391.

Small chances, law of, 265-266, 366- 
370 , refs , 273, 394. 

Smith, B B , refs , time correlation, 
393 

Smith, C. D., refs., Tchebycheff in-
equalities, 400

Snow, E. C., refs., estimates of popu-
lation, 130, 253; lines and planes
of closest fit, 209
Soil surveys, errors in, refs., 402


Soper, H. E., refs., probable error
of correlation coefficient, 355,
397, 399; of biserial expression
for correlation-coefficient, 355;
Frequency Arrays, 395; sampling,
397, 399, 400; tables of ex-
ponential binomial limit, 273

Southey, Robert, cited re Cosin’s
Names of the Roman Catholics,
etc., 100

Spearman, C., effect of errors of
observation on the standard-
deviation and coefficient of corre-
lation, 213-214. Refs., effect of
errors of observation, 225, 333,
392; rank method of correlation,
333, 397.

Splawa-Neyman, J,, refs , probable 
errors, 398. 

Spurious correlation of indices, 215- 
216 ; refs , 226, 392. 

Standard-deviation. See Deviation, 
standard 

Standardisation of death-rates, 223-
225; refs., 226, 392.

Statist, occurrence of the word in
Shakespeare and in Milton, 1

Statistical, introduction and de-
velopment in the meaning of the
word, 1-5; S. Account of Scotland,
2; Royal S. Society, 3; methods,
purport of, 3-5; def., 5

Statistics, introduction and develop-
ment in meaning of word, 1-5;
def., 5; theory of, def., 5

Statures of males in U.K., tables, 88,
90; diagrams, 89, 91; calcula-
tion of mean, 112; means and
medians, 117, (qu. 1) 131; stan-
dard-deviation, 141; percentiles,
153; standard-deviation, mean
deviation, and quartiles, (qu. 1)
156; distribution fitted to normal
curve, 305-306, 307-308; dia-
gram, 306; standard errors of
mean and median, of first to
ninth deciles, 341, 343, 344-
345; of standard-deviation and
semi-interquartile range, (qu. 5)
355.

— correlation of, for father and
son, 160; diagrams, facing 166,
174; constants, (qu. 3) 189;
testing for normality, 322-328;
for isotropy, 329-331; diagram
of diagonal distribution, 325; of
fitted contour lines, 327
Stead, H. G., correlation-coefficients,
ref., 392.

Steffensen, J. F., refs., Recent Re-
searches, 407.

Stevenson, T. H. C., refs., birth-
rates, correction of, for age-
distribution, 226

Stigmatic rays on poppies, fre-
quency, 78; unsuitability of
median for such distributions, 116
Stirling, James, expression for fac-
torials of large numbers, 304
Stoessiger, B., refs., probability
integrals for small samples, 401.
Stratton, F. J. M., refs., errors of
agricultural experiment, 402.

“Student” (pseudonym), refs., law
of small chances, 273, 394; prob-
able errors, 356, 397, 398 (under
Fisher, R. A.); deviations from
Poisson’s Law, 394; probable
errors of Spearman’s correlation-
coefficients, 397; method of
cereal testing, 402

Surface, F. M., refs., errors in variety
tests, 402

Symmetrical frequency-distribu-
tions, 87-90. See also Frequency-
distributions; Normal curve
Symons, G. J., use of word “sta-
tistics” in British Rainfall, 3.

Tables, calculating, of functions, 
etc , refs , 357-359, 401 , see also 
under subject-headings 
Tabulation, of statistics of attri- 
butes, 11-14, 37 , of a frequency- 
distribution, 81-83 , of a correla- 
tion table, 164 

Tappan, M , refs , partial correlation, 
394 

Tatham, John, refs , standardisation 
of death-rates, 226. 

Tchebycheff, P. L. See Chebysheff
Tedin, O., refs., yield trials, 404
Thiele, T. N., refs., The Theory of
Observations, 394

Thomson, G. H., refs., The Essentials
of Mental Measurement, 404.
Thorndike, E. L., refs., methods
of measuring correlation, 333;
Theory of Mental and Social
Measurements, 361.


Time-correlation problem, 197-201 ; 
refs , 208-209, 392-393. 

Tippett, L. H. C., refs., extremes of
sample, 399; The Methods of
Statistics, 407

Tocher, J. F., refs., contingency, 390.

Todhunter, I., refs., History of the
Mathematical Theory of Proba-
bility, 6

Trachtenberg, M. I., refs., property
of median, 154.

Trought, Trevor, refs., cotton trials,
402

Tschebysheff, P. L. See Chebysheff

Tschuprow, A. A., refs., partial
correlation, 394; mathematical
expectation of moments, 397;
distribution of means, 398; Korre-
lationstheorie, 405

Type of array, def., 164 

Ultimate classes and frequencies, 
def , 12 ; sufficiency of, for tabu- 
lation, 12-13 

Universe, def., 17 ; specification of, 
17, 18 

U-shaped frequency-distributions,
102-105.


Value, annual, of dwelling-houses,
table, 83; median, (qu. 4) 131;
quartiles, (qu. 3) 155.

— of estates in 1715, table, 100;
diagram, 101.

Variables, theory of, generally, 75-
253; def., 7, 75.

Variance, for square of standard 
deviation, 144 ; refs , analysis of 
variance, Irwin, 394 , Tippett, 
407. 

Variates, def , 150. 

Variation, coefficient of, 149 , stan- 
dard error of, 351, 352 

Variety trials, errors in, refs , 401- 
404 

Venn, John, refs , Logic of Chance, 
sex-ratio, 273, 361 

Verschaeffelt, E., relative dispersion,
149. Refs., measure of relative
dispersion, 154

Vigor, H. D., data cited from, 163.
Refs., sex-ratio, 273.

Vik, Knut, refs., yield trials, (under
Behrens) 403.






Wages of agricultural labourers.
See Earnings.

Wages, real, refs., 390-391
Walker, Helen M., refs., History of
Statistical Method, 390.

Warner, F., refs., study of defects in
school children, notation for sta-
tistics of attributes, 15
Water analysis, methods, refs., 394
Waters, A. C., refs., estimating
intercensal populations, 130
Weather and crops, correlation, 196-
197; refs., 208

Weight of males in U.K., table, 95;
diagram, 94; mean, median, and
mode, (qu. 2) 131; standard
deviation, mean deviation, and
quartiles, (qu. 2) 155
Weighted mean. See Mean,
weighted; also Mean, geometric;
Median, Mode

Weldon, W. F. R., dice-throwing
experiments, 258-259, 373-376
West, C. J., refs., Introduction to
Mathematical Statistics, 405
Westergaard, H., refs., Theorie der
Statistik, 6, 273, 361, 406.
Whipple, G. C., refs., Vital Statistics,
406

Whitaker, Lucy, ref., law of small
numbers, 273

Whittaker, E. T., refs., Calculus of
Observations, 405

Wicksell, S. D., refs., correlation,
391, 392

Will, H. S., refs., curve-fitting, 393
Willcox, W. F., citation of Bielfeld,
1

Winkler, Wilhelm, refs., Grundriss
der Statistik, 407.

Wishart, John, refs., sampling, 399,
400, 401; agricultural experi-
ment, 403, 404

Wolfenden, H. H., ref., mortalities
and death-rates, 392
Woo, T. L., refs., sampling, 400
Wood, Frances, refs., index-correla-
tions, 226, 252; index-numbers,
390

Wood, T. B., refs., errors of agricul-
tural experiment, 401, 402; varia-
tion in mangels, 402
Woods, Hilda M., refs., Medical
Statistics, 407

Working, H., refs., time series, 393
Working classes, cost of living, refs.,
390-391.


Yield trials, refs , 401-404 

Young, Allyn A , refs , age statis- 
tics, 105 

Young, Andrew, refs , probable error 
of coefficient of contingency, 397 

Yule, G. U., use of term character-
istic lines (lines of regression), 177;
problem of pauperism, 192; data
cited from, 78, 93, 122, 140, 163,
185, facing 186, 259, 385. Refs.,
history of words “statistics,”
“statistical,” 5; attributes, asso-
ciation, consistence, etc., 15, 23,
39, 40, 57; isotropy, influence of
bias in statistics of qualities, 73;
correlation, 188, 226, 252, 392;
correlation between indices, 226;
frequency-curves, 314, 396; prob-
able errors, 355, 396; pauperism,
130, 208, 253; birth-rates, 208,
226; sex-ratio, 273; fluctuations
of sampling in Mendelian ratios,
273; time-correlation problem,
392; application of law of small
chances, 391; goodness of fit in
association and contingency
tables, 396; yield trials, 402.

Zimmermann, E. A. W., use of the
words “statistics,” “statistical,”
in English, 1

Zimmermann, H., multiplication
table, 358

Zizek, F., refs., Die statistischen Mit-
telwerthe and translation, 120


PRINTED IN GREAT BRITAIN BY NEILL AND CO., LTD., EDINBURGH



